C# Binary Search Tree

Hi Folks,

I thought I would post source code for a Binary Search Tree Algorithm that I recently worked on.

https://github.com/Romiko/Dev2

In a future post, I will share source code for a Balanced Binary Search Tree (a Red-Black Tree).

Remember that with an unbalanced Binary Search Tree, if you insert data in sorted order, the tree degenerates into what is effectively a linked list rather than a binary tree, which is why it is important to use a Balanced Binary Search Tree.
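To see this degeneration concretely, here is a minimal, self-contained sketch (my own illustration, not the repository code, so the class and method names are invented): inserting the keys 1..100 in sorted order yields a tree whose depth equals its node count.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch, not the repository code: sorted inserts into an
// unbalanced BST build one long right spine, i.e. a linked list in disguise.
public static class DegenerateDemo
{
    private class Node { public int Key; public Node Left, Right; }

    // Standard iterative BST insert (duplicates would go right; irrelevant here).
    private static Node Insert(Node root, int key)
    {
        var node = new Node { Key = key };
        if (root == null) return node;
        var current = root;
        while (true)
        {
            if (key < current.Key)
            {
                if (current.Left == null) { current.Left = node; return root; }
                current = current.Left;
            }
            else
            {
                if (current.Right == null) { current.Right = node; return root; }
                current = current.Right;
            }
        }
    }

    // Iterative depth calculation using an explicit stack.
    private static int Depth(Node root)
    {
        if (root == null) return 0;
        var stack = new Stack<(Node Node, int Depth)>();
        stack.Push((root, 1));
        var max = 0;
        while (stack.Count > 0)
        {
            var (n, d) = stack.Pop();
            if (d > max) max = d;
            if (n.Left != null) stack.Push((n.Left, d + 1));
            if (n.Right != null) stack.Push((n.Right, d + 1));
        }
        return max;
    }

    // Insert 1..n in sorted order and report the resulting tree depth.
    public static int DepthAfterSortedInserts(int n)
    {
        Node root = null;
        for (var i = 1; i <= n; i++) root = Insert(root, i);
        return Depth(root);
    }

    public static void Main()
    {
        // A balanced tree of 100 nodes is ~7 levels deep; this one is 100.
        Console.WriteLine(DepthAfterSortedInserts(100));
    }
}
```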

Of course, you can use built-in .NET Framework types like SortedSet and SortedDictionary, but for those of you who want to write your own implementation, this can be a good starting point.

Remember the rules for a Binary Search Tree: for any node, every key in its left subtree compares less than the node's key, and every key in its right subtree compares greater, as determined by your IComparer implementation. An in-order traversal therefore returns the data in sorted order.

I also wrote a method to delete nodes from a Binary Search Tree, which treats leaf nodes, nodes with one child and nodes with two children differently.

There are a lot of articles out there on BSTs, but if you want to understand them, you must understand the rules governing the insertion, deletion and retrieval algorithms.

You can find the source code at my repository here:
https://github.com/Romiko/Dev2

You will notice that there are also numerous unit tests for the solution.

You will also notice that when deleting a node with two children, I roll a random dice to choose between the in-order successor and predecessor, so as not to degenerate the binary tree; if the tree were balanced, this would not be necessary.

One more thing: I have avoided recursive functions. Why? With large binary trees, deeply recursive functions can exhaust the call stack, so I avoid them altogether and use iterative loops instead.
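The same stack-friendly idea applies to traversal. Below is a minimal sketch (my own illustration, not code from the repository) of an in-order traversal that uses an explicit Stack&lt;T&gt; instead of recursion, so tree height cannot overflow the call stack; for a valid BST it yields keys in sorted order.

```csharp
using System.Collections.Generic;

// Illustrative sketch, not repository code: iterative in-order traversal.
public static class IterativeTraversal
{
    public class Node<T> { public T Key; public Node<T> Left, Right; }

    public static IEnumerable<T> InOrder<T>(Node<T> root)
    {
        var stack = new Stack<Node<T>>();
        var current = root;
        while (current != null || stack.Count > 0)
        {
            // Walk as far left as possible, remembering the path.
            while (current != null)
            {
                stack.Push(current);
                current = current.Left;
            }
            current = stack.Pop();
            yield return current.Key; // visit the node in sorted position
            current = current.Right;  // then traverse its right subtree
        }
    }
}
```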

Here are some code snippets of the important algorithms. If you have any questions, or want to contribute to the code, you are welcome to do so.


public void Add(TKey key, TValue value)
{
    var newNode = new BinaryTreeNode { KeyValue = new KeyValuePair<TKey, TValue>(key, value) };

    if (Root == null)
        Root = newNode;
    else
    {
        var current = Root;
        while (true)
        {
            var compareResult = Comparer.Compare(key, current.KeyValue.Key);
            if (compareResult == 0)
                throw new ArgumentException("Duplicate key found.");

            if (compareResult < 0)
            {
                if (current.Left != null)
                    current = current.Left;
                else
                {
                    current.Left = newNode;
                    newNode.Parent = current;
                    break;
                }
            }
            else
            {
                if (current.Right != null)
                    current = current.Right;
                else
                {
                    current.Right = newNode;
                    newNode.Parent = current;
                    break;
                }
            }
        }
    }
}

private void Delete(BinaryTreeNode node)
        {
            var nodeType = GetNodeType(node);

            if (nodeType == NodeType.LeafeNode)
            {
                DeleteLeafNode(node);
                return;
            }
            if (nodeType == NodeType.HasOneChild)
            {
                DeleteNodeWithOneChild(node);
                return;
            }
            DeleteNodeWithTwoChildren(node);
        }

        private void DeleteNodeWithTwoChildren(BinaryTreeNode node)
        {
            // Use random delete method, to avoid degenerate tree structure.
            var replaceInOrderType = RollDiceForSuccessorOrPredecessor();

            if (replaceInOrderType == InOrderNode.Successor)
                UseSuccessor(node);
            else
                UsePredecessor(node);
        }

        private static void DeleteNodeWithOneChild(BinaryTreeNode node)
        {
            var theChild = node.Left ?? node.Right;
            theChild.Parent = node.Parent;
            switch (NodeLinkedToParentAs(node))
            {
                case NodeLinkToParentAs.Right:
                    node.Parent.Right = theChild;
                    break;
                case NodeLinkToParentAs.Left:
                    node.Parent.Left = theChild;
                    break;
                case NodeLinkToParentAs.Root:
                    node.Parent = null;
                    break;
            }
        }

        private static void DeleteLeafNode(BinaryTreeNode node)
        {
            if (NodeLinkedToParentAs(node) == NodeLinkToParentAs.Right)
                node.Parent.Right = null;
            else
                node.Parent.Left = null;
        }

        private void UsePredecessor(BinaryTreeNode node)
        {
            var replaceWith = ReadLastRightNode(node.Left);
            node.KeyValue = replaceWith.KeyValue;
            Delete(replaceWith);
        }

        private void UseSuccessor(BinaryTreeNode node)
        {
            var replaceWith = ReadLastLeftNode(node.Right);
            node.KeyValue = replaceWith.KeyValue;
            Delete(replaceWith);
        }

        private InOrderNode RollDiceForSuccessorOrPredecessor()
        {
            if (ForceDeleteType.HasValue)
                return ForceDeleteType.Value;

            var random = new Random();
            var choice = random.Next(0, 2);
            return choice == 0 ? InOrderNode.Predecessor : InOrderNode.Successor;
        }



Lucene Full Text Indexing with Neo4j

Hi Guys,

I spent some time working on full text search for Neo4j. The basic goals were as follows.

    • Control the pointers of the index
    • Full text search
    • All operations are done via REST
    • Can create an index when creating a node
    • Can update an index
    • Can check if an index exists
    • When bootstrapping Neo4j in the cloud, run index checks
    • Query an index using the Lucene full text search query language
Download:
This is based on Neo4jClient:
Source Code at:

Introduction

So with the above objectives, I decided to go with Manual Indexing. The main reason is that I can make an index entry point to node A based on values in node B.

Imagine the following.

You have Node A with a list:

Surname, FirstName and MiddleName. However, Node A also has a relationship to Node B, which has other names, perhaps Display Names, Avatar Names and AKAs.

So with manual indexing, you can have all the above name entries from Node A and Node B point to Node A only.

So, a REST call to the Neo4j server would look something like this in Fiddler.

(screenshot: the index request captured in Fiddler)

Notice the following:

Url: http://localhost:7474/db/data/index/node/{IndexName}/{Key}/{Value}

So, if we were adding 3 names for the SAME client from 2 different nodes, you would use the same IndexName and Key with different values in the Url. The node pointer (in the request body) will then be the address of the node.
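As a sketch of how that Url is put together (the class and method here are my own illustration, not part of Neo4jClient), you could build the index entry address like this, escaping each segment:

```csharp
using System;

// Illustrative helper, not Neo4jClient code: builds the manual index entry
// address index/node/{IndexName}/{Key}/{Value} under a given base Url.
public static class IndexUrl
{
    public static string ForNodeEntry(string baseUrl, string indexName, string key, string value)
    {
        return string.Format("{0}/index/node/{1}/{2}/{3}",
            baseUrl.TrimEnd('/'),
            Uri.EscapeDataString(indexName),
            Uri.EscapeDataString(key),   // escape each segment; names may contain spaces
            Uri.EscapeDataString(value));
    }
}
```

The POST body then carries the full address of the node being indexed, as in the Fiddler capture.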

Neo4jClient Nuget Package

I have updated the Neo4jClient package, which is on NuGet, to now support:

  • Creating Exact or FullText indexes on their own, so that they just exist
  • Creating Exact or FullText indexes when creating a node; the node reference will automatically be calculated
  • Updating an index
  • Deleting entries from an index

Class diagram for the indexing solution in Neo4jClient:

(image: Neo4jClient indexing class diagram)

RestSharp

The Neo4jClient package uses RestSharp, which makes all the index call operations trivial. Let's have a look at some of the code inside the client to see how to consume the manual index API from .NET, and in the next section we'll look at how to consume this code from another application.

 public Dictionary<string, IndexMetaData> GetIndexes(IndexFor indexFor)
        {
            CheckRoot();

            string indexResource;
            switch (indexFor)
            {
                case IndexFor.Node:
                    indexResource = RootApiResponse.NodeIndex;
                    break;
                case IndexFor.Relationship:
                    indexResource = RootApiResponse.RelationshipIndex;
                    break;
                default:
                    throw new NotSupportedException(string.Format("GetIndexes does not support indexfor {0}", indexFor));
            }

            var request = new RestRequest(indexResource, Method.GET)
            {
                RequestFormat = DataFormat.Json,
                JsonSerializer = new CustomJsonSerializer { NullHandling = JsonSerializerNullValueHandling }
            };

            var response =  client.Execute<Dictionary<string, IndexMetaData>>(request);

            if (response.StatusCode != HttpStatusCode.OK)
                throw new NotSupportedException(string.Format(
                    "Received an unexpected HTTP status when executing the request.\r\n\r\n\r\nThe response status was: {0} {1}",
                    (int)response.StatusCode,
                    response.StatusDescription));

            return response.Data;
        }

        public bool CheckIndexExists(string indexName, IndexFor indexFor)
        {
            CheckRoot();

            string indexResource;
            switch (indexFor)
            {
                case IndexFor.Node:
                    indexResource = RootApiResponse.NodeIndex;
                    break;
                case IndexFor.Relationship:
                    indexResource = RootApiResponse.RelationshipIndex;
                    break;
                default:
                    throw new NotSupportedException(string.Format("IndexExists does not support indexfor {0}", indexFor));
            }

            var request = new RestRequest(string.Format("{0}/{1}",indexResource, indexName), Method.GET)
            {
                RequestFormat = DataFormat.Json,
                JsonSerializer = new CustomJsonSerializer { NullHandling = JsonSerializerNullValueHandling }
            };

            var response = client.Execute<Dictionary<string, IndexMetaData>>(request);

            return response.StatusCode == HttpStatusCode.OK;
        }

        void CheckRoot()
        {
            if (RootApiResponse == null)
                throw new InvalidOperationException(
                    "The graph client is not connected to the server. Call the Connect method first.");
        }

        public void CreateIndex(string indexName, IndexConfiguration config, IndexFor indexFor)
        {
            CheckRoot();

            string nodeResource;
            switch (indexFor)
            {
                case IndexFor.Node:
                    nodeResource = RootApiResponse.NodeIndex;
                    break;
                case IndexFor.Relationship:
                    nodeResource = RootApiResponse.RelationshipIndex;
                    break;
                default:
                    throw new NotSupportedException(string.Format("CreateIndex does not support indexfor {0}", indexFor));
            }

            var createIndexApiRequest = new
                {
                    name = indexName.ToLower(),
                    config
                };

            var request = new RestRequest(nodeResource, Method.POST)
                {
                    RequestFormat = DataFormat.Json,
                    JsonSerializer = new CustomJsonSerializer {NullHandling = JsonSerializerNullValueHandling}
                };
            request.AddBody(createIndexApiRequest);

            var response = client.Execute(request);

            if (response.StatusCode != HttpStatusCode.Created)
                throw new NotSupportedException(string.Format(
                    "Received an unexpected HTTP status when executing the request..\r\n\r\nThe index name was: {0}\r\n\r\nThe response status was: {1} {2}",
                    indexName,
                    (int) response.StatusCode,
                    response.StatusDescription));
        }

        public void ReIndex(NodeReference node, IEnumerable<IndexEntry> indexEntries)
        {
            CheckRoot();

            var nodeAddress = string.Join("/", new[] {RootApiResponse.Node, node.Id.ToString()});

            var updates = indexEntries
                .SelectMany(
                    i => i.KeyValues,
                    (i, kv) => new {IndexName = i.Name, kv.Key, kv.Value});

            foreach (var update in updates)
            {
                if (update.Value == null)
                    continue; // skip null values instead of aborting the remaining updates

                string indexValue;
                if(update.Value is DateTimeOffset)
                {
                    indexValue = ((DateTimeOffset) update.Value).UtcTicks.ToString();
                }
                else if (update.Value is DateTime)
                {
                    indexValue = ((DateTime)update.Value).Ticks.ToString();
                }
                else
                {
                    indexValue = update.Value.ToString();
                }

                AddNodeToIndex(update.IndexName, update.Key, indexValue, nodeAddress);
            }
        }

        public void DeleteIndex(string indexName, IndexFor indexFor)
        {
            CheckRoot();

            string indexResource;
            switch (indexFor)
            {
                case IndexFor.Node:
                    indexResource = RootApiResponse.NodeIndex;
                    break;
                case IndexFor.Relationship:
                    indexResource = RootApiResponse.RelationshipIndex;
                    break;
                default:
                    throw new NotSupportedException(string.Format("DeleteIndex does not support indexfor {0}", indexFor));
            }

            var request = new RestRequest(string.Format("{0}/{1}", indexResource, indexName), Method.DELETE)
            {
                RequestFormat = DataFormat.Json,
                JsonSerializer = new CustomJsonSerializer { NullHandling = JsonSerializerNullValueHandling }
            };

            var response = client.Execute(request);

            if (response.StatusCode != HttpStatusCode.NoContent)
                throw new NotSupportedException(string.Format(
                    "Received an unexpected HTTP status when executing the request.\r\n\r\nThe index name was: {0}\r\n\r\nThe response status was: {1} {2}",
                    indexName,
                    (int)response.StatusCode,
                    response.StatusDescription));
        }

        void AddNodeToIndex(string indexName, string indexKey, string indexValue, string nodeAddress)
        {
            var nodeIndexAddress = string.Join("/", new[] { RootApiResponse.NodeIndex, indexName, indexKey, indexValue });
            var request = new RestRequest(nodeIndexAddress, Method.POST)
            {
                RequestFormat = DataFormat.Json,
                JsonSerializer = new CustomJsonSerializer { NullHandling = JsonSerializerNullValueHandling }
            };
            request.AddBody(string.Join("", client.BaseUrl, nodeAddress));

            var response = client.Execute(request);

            if (response.StatusCode != HttpStatusCode.Created)
                throw new NotSupportedException(string.Format(
                    "Received an unexpected HTTP status when executing the request.\r\n\r\nThe index name was: {0}\r\n\r\nThe response status was: {1} {2}",
                    indexName,
                    (int)response.StatusCode,
                    response.StatusDescription));
        }

        public IEnumerable<Node<TNode>> QueryIndex<TNode>(string indexName, IndexFor indexFor, string query)
        {
            CheckRoot();

            string indexResource;

            switch (indexFor)
            {
                case IndexFor.Node:
                    indexResource = RootApiResponse.NodeIndex;
                    break;
                case IndexFor.Relationship:
                    indexResource = RootApiResponse.RelationshipIndex;
                    break;
                default:
                    throw new NotSupportedException(string.Format("QueryIndex does not support indexfor {0}", indexFor));
            }

            var request = new RestRequest(indexResource + "/" + indexName, Method.GET)
                {
                    RequestFormat = DataFormat.Json,
                    JsonSerializer = new CustomJsonSerializer {NullHandling = JsonSerializerNullValueHandling}
                };

            request.AddParameter("query", query);

            var response = client.Execute<List<NodeApiResponse<TNode>>>(request);

            if (response.StatusCode != HttpStatusCode.OK)
                throw new NotSupportedException(string.Format(
                    "Received an unexpected HTTP status when executing the request.\r\n\r\nThe index name was: {0}\r\n\r\nThe response status was: {1} {2}",
                    indexName,
                    (int) response.StatusCode,
                    response.StatusDescription));

            return response.Data == null
           ? Enumerable.Empty<Node<TNode>>()
           : response.Data.Select(r => r.ToNode(this));
        }
		

Using the Neo4jClient from within an application

Create an Index and check if it exists

This is useful when bootstrapping Neo4j, to see if any indexes that SHOULD be there are missing, so that you can enumerate all the nodes for that index and add the entries.

public void CreateIndexesForAgencyClients()
        {
            var agencies = graphClient
                .RootNode
                .Out<Agency>(Hosts.TypeKey)
                .ToList();

            foreach (var agency in agencies)
            {
                var indexName = IndexNames.Clients(agency.Data);
                var indexConfiguration = new IndexConfiguration
                    {
                        Provider = IndexProvider.lucene,
                        Type = IndexType.fulltext
                    };

                if (!graphClient.CheckIndexExists(indexName, IndexFor.Node))
                {
                    Trace.TraceInformation("CreateIndexIfNotExists {0} for Agency Key {1}", indexName, agency.Data.Key);
                    graphClient.CreateIndex(indexName, indexConfiguration, IndexFor.Node);
                    PopulateAgencyClientIndex(agency.Data);
                }
            }
        }

Create an Index Node Entry when creating a node

 var indexEntries = GetIndexEntries(agency.Data, client, clientViewModel.AlsoKnownAses);

var clientNodeReference = graphClient.Create(
                client,
                new[] {new ClientBelongsTo(agencyNode.Reference)}, indexEntries);

public IEnumerable<IndexEntry> GetIndexEntries(Agency agency, Client client, IEnumerable<AlsoKnownAs> alsoKnownAses)
        {
            var indexKeyValues = new List<KeyValuePair<string, object>>
            {
                new KeyValuePair<string, object>(AgencyClientIndexKeys.Gender.ToString(), client.Gender)
            };

            if (client.DateOfBirth.HasValue)
            {
                var dateOfBirthUtcTicks = client.DateOfBirth.Value.UtcTicks;
                indexKeyValues.Add(new KeyValuePair<string, object>(AgencyClientIndexKeys.DateOfBirth.ToString(), dateOfBirthUtcTicks));
            }

            var names = new List<string>
            {
                client.GivenName,
                client.FamilyName,
                client.PreferredName,
            };

            if (alsoKnownAses != null)
            {
                names.AddRange(alsoKnownAses.Where(a => !string.IsNullOrEmpty(a.Name)).Select(aka => aka.Name));
            }

            indexKeyValues.AddRange(names.Select(name => new KeyValuePair<string, object>(AgencyClientIndexKeys.Name.ToString(), name)));

            return new[]
            {
                new IndexEntry
                {
                    Name = IndexNames.Clients(agency),
                    KeyValues = indexKeyValues.Where(v => v.Value != null)
                }
            };
        }
		

Reindex a node

Notice there was a call to PopulateAgencyClientIndex in the code. This is done in our bootstrap to ensure indexes are always there as expected; if for some reason they are not, they are created and populated using the reindex feature.

void PopulateAgencyClientIndex(Agency agency)
        {
            var clients = graphClient
                .RootNode
                .Out<Agency>(Hosts.TypeKey, a => a.Key == agency.Key)
                .In<Client>(ClientBelongsTo.TypeKey);

            foreach (var client in clients)
            {
                var clientService = clientServiceCallback();
                var akas = client.Out<AlsoKnownAs>(IsAlsoKnownAs.TypeKey).Select(a => a.Data);
                var indexEntries = clientService.GetIndexEntries(agency, client.Data, akas);
                graphClient.ReIndex(client.Reference, indexEntries);
            }
        }
		

Querying a full text search index using Lucene

Below is sample code to query the full text search index. Suppose your index entries are for a person with:

Name: Bob, Surname: Van de Builder, Aka1: Bobby, Aka2: Bobs, PreferredName: Bob The Builder

The index entries will need to look like the following:

Key: Value
Name: Bob
Name: Van
Name: de
Name: Builder
Name: Bobby
Name: Bobs

Remember, Lucene has a whitespace analyser, so any names containing spaces MUST become separate index entries; we split names on whitespace and that becomes our collection of IndexEntries. This applies to the full text search context.
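As a sketch of that splitting step (an illustrative helper of my own, not code from Neo4jClient), each name is broken into whitespace-separated tokens and every token becomes its own Name entry:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative helper, not Neo4jClient code: split names on whitespace so
// each token becomes a separate ("Name", token) index entry, matching the
// behaviour of Lucene's whitespace analyser.
public static class NameIndexing
{
    public static List<KeyValuePair<string, object>> ToNameEntries(IEnumerable<string> names)
    {
        return names
            .Where(n => !string.IsNullOrWhiteSpace(n))
            .SelectMany(n => n.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
            .Select(token => new KeyValuePair<string, object>("Name", token))
            .ToList();
    }
}
```

So "Van de Builder" contributes three entries: Name:Van, Name:de and Name:Builder.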

Note: if you are using an EXACT index match, then composite entries are needed for multiple words, since you are no longer using Lucene's full text search capabilities, e.g.

Name: Bob The Builder

This is good to know, because fields like postal codes or Gender, where exact matches are required, do not need full text indexes.

Let's check out an example of querying an index.

        [Test]
        public void VerifyWhenANewClientIsCreateThatPartialNameCanBeFuzzySearchedInTheFullTextSearchIndex()
        {
            using (var agency = Data.NewTestAgency())
            using (var client = Data.NewTestClient(agency, c =>
            {
                c.Gender = Gender.Male;
                c.GivenName = "Joseph";
                c.MiddleNames = "Mark";
                c.FamilyName = "Kitson";
                c.PreferredName = "Joey";

                c.AlsoKnownAses = new List<AlsoKnownAs>
                    {
                       new AlsoKnownAs {Name = "J-Man"},
                       new AlsoKnownAs {Name = "J-Town"}
                    };
            }
                ))
            {
                var indexName = IndexNames.Clients(agency.Agency.Data);
                const string partialName = "+Name:Joe~+Name:Kitson~";
                var result = GraphClient.QueryIndex<Client>(indexName, IndexFor.Node, partialName);
                Assert.AreEqual(client.Client.Data.UniqueId, result.First().Data.UniqueId);
            }
        }
		

Dates

Notice in some of the code that when I store date entries in the index, I store them as Ticks, i.e. as long numbers. This is great, as it allows raw range searching on dates via longs.

 [Test]
        public void VerifyWhenANewClientIsCreateThatTheDateOfBirthCanBeRangeSearchedInTheFullTextSearchIndex()
        {
            // Arrange
            const long dateOfBirthTicks = 634493518171556320;
            using (var agency = Data.NewTestAgency())
            using (var client = Data.NewTestClient(agency, c =>
            {
                c.Gender = Gender.Male;
                c.GivenName = "Joseph";
                c.MiddleNames = "Mark";
                c.FamilyName = "Kitson";
                c.PreferredName = "Joey";
                c.DateOfBirth = new DateTimeOffset(dateOfBirthTicks, new TimeSpan());
                c.CurrentAge = null;
                c.AlsoKnownAses = new List<AlsoKnownAs>
                    {
                       new AlsoKnownAs {Name = "J-Man"},
                       new AlsoKnownAs {Name = "J-Town"}
                    };
            }
                ))
            {
                // Act
                var indexName = IndexNames.Clients(agency.Agency.Data);
                var partialName = string.Format("DateOfBirth:[{0} TO {1}]", dateOfBirthTicks - 5, dateOfBirthTicks + 5);
                var result = GraphClient.QueryIndex<Client>(indexName, IndexFor.Node, partialName);
                // Assert
                Assert.AreEqual(client.Client.Data.UniqueId, result.First().Data.UniqueId);
            }
        }
		

Summary

Well, I hope you found this post useful. Neo4jClient is on NuGet, so have a bash at using it; I would love to hear your feedback.

Download

NuGetPackage:
Source Code at:

Cheers

WCF Architecture/Extensibility Overview

Hi Guys,

This post is going to discuss some basic high-level aspects of WCF.

Some high-level facts:

  • A Message Contract is the structure of an actual WCF message; it describes the nature of the message.
  • A Data Contract is the PAYLOAD, or actual data, which is embedded in the message.
  • The service runtime is primarily concerned with processing the content of message bodies.
  • The messaging layer is concerned with “channels” and channel stacks (more than one channel). There are two types of channels:
    • Protocol channel – message header management, e.g. WS-Security/WS-ReliableMessaging.
    • Transport channel – how data is communicated/translated/encoded/decoded on the wire, e.g. Http, Netmsmq.
  • Hosting – WCF can be hosted in a Windows Service, an executable, IIS WAS or IIS. You can even run it inside an NServiceBus host if you wanted.

We all know the ABCs of WCF.

A service needs an Address, a Binding and a Contract. But there is a lot more to WCF than meets the eye.

Behaviors
Behaviors control various run-time aspects of a service, an endpoint, a particular operation, or a client:
Common behaviors affect all endpoints globally,
service behaviors affect only service-related aspects,
endpoint behaviors affect only endpoint-related properties, and
operation-level behaviors affect particular operations.

e.g. one service behavior is throttling, which specifies how a service reacts when an excess of messages threatens to overwhelm the system; an endpoint behavior might specify where to find a security credential.
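For instance, a throttling service behavior can be declared in configuration like this (the numbers are illustrative only and should be tuned for your workload):

```xml
<behaviors>
  <serviceBehaviors>
    <behavior name="ThrottledBehavior">
      <!-- Illustrative values; tune for your own workload. -->
      <serviceThrottling maxConcurrentCalls="16"
                         maxConcurrentSessions="100"
                         maxConcurrentInstances="116" />
    </behavior>
  </serviceBehaviors>
</behaviors>
```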

In regards to service behaviours, one aspect that is often overlooked is the instance and concurrency modes. Read more about them further down in this article.

Instances and Concurrency

This is often overlooked: always be aware of how you write your WCF service and ensure the code is thread safe and can handle the configured instancing and concurrency modes, or you might find your WCF services are not scalable! These are things you should always think about BEFORE you write the service. You should read this article to get a better understanding of it.

http://msdn.microsoft.com/en-us/library/ms731193.aspx

Have a read, and ensure your classes are thread safe so they can scale. If you keep shared variables that maintain state at the service layer in your WCF code, you will find yourself in deep water. You can use sessions, instancing or concurrency mode combinations to control these aspects.

Here is an interesting example of customising these options, which should get you thinking about how you combine these sorts of behavioural modes!

[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall, ConcurrencyMode = ConcurrencyMode.Single)]
public class ProductsService : IProductsService
{
    // Your service logic
}

Per-call services are the Windows Communication Foundation default instantiation mode. When the service type is configured for per-call activation, a service instance, a common language runtime (CLR) object, exists only while a client call is in progress. Every client request gets a new dedicated service instance. This keeps the lifetime of objects as short as possible.

  1. The client calls the proxy and the proxy forwards the call to the service.
  2. Windows Communication Foundation creates a service instance and calls the method on it.
  3. After the method call returns, if the object implements IDisposable, then Windows Communication Foundation calls IDisposable.Dispose on it.
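Step 3 is worth demonstrating. The sketch below is a plain stand-in class (my own illustration, not actual WCF plumbing) showing the pattern the host relies on: a per-call service that implements IDisposable has its Dispose method called once the call returns, so per-request resources are released deterministically.

```csharp
using System;

// Stand-in for a per-call WCF service (no WCF plumbing here): the host
// creates an instance per call and disposes it when the call returns.
public class PerCallProductsService : IDisposable
{
    public bool Disposed { get; private set; }

    public string GetProductName(int id)
    {
        // Any state here lives only for the duration of this one request.
        return "Product-" + id;
    }

    public void Dispose()
    {
        // In WCF this is invoked by the runtime after the method call returns.
        Disposed = true;
    }
}
```

A using block simulates the host's behaviour: the instance exists only for the call and is disposed immediately afterwards.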

Single: Each instance context is allowed to have a maximum of one thread processing messages in the instance context at a time. Other threads wishing to use the same instance context must block until the original thread exits the instance context.

Can you see something here? This combination is redundant: because the service is PerCall, every request gets its own dedicated instance context, so no two threads ever share one, and ConcurrencyMode.Single adds nothing. Some combinations never need to be explicit, such as the redundant attribute in the code above.

Dispatcher Runtime And Client Runtime

What if you want to customize WCF? How about introducing a custom encoding/decoding algorithm, a custom compression/validation system, or a custom error handler for legacy systems?

Let's get even fancier: how about a custom instance provider that can hydrate and dehydrate WCF instances to and from a database for long running transactions, similar to the idea of Sagas in NServiceBus?

This can all be done on the client runtime or dispatcher on the service.

There is a lot going on in WCF, and there are several posts out there on making a Hello World WCF service, so let's skip all that and get down to the extensibility of WCF. We will focus on the dispatcher and message inspectors: first what we can do on the client side, and then on the server side.

Here is an overview of the architecture.

(diagram: WCF client runtime and dispatcher runtime architecture)

From the above diagram you can see that WCF is very extensible, there are hooks in the architecture where you can extend the functionality of WCF.

The client runtime is responsible for translating method invocations into outbound messages, pushing them to the underlying channels, and translating results back into return values and out parameters.

This runtime model presents different service model extensions to modify or implement execution or communication behavior and features client or dispatcher functionality such as message and parameter interception, operation selection, message encoding and other extensibility functionality.

In the service, the dispatcher runtime is responsible for pulling incoming messages out of the underlying channels, translating them into method invocations in application code, and sending the results back to the caller. This runtime model presents different service model extensions to modify or implement execution or communication behavior and features client or dispatcher functionality such as message and parameter interception, message filtering, encoding and other extensibility functionality.

There are numerous examples here:

http://msdn.microsoft.com/en-us/library/ff183867.aspx

Here is an example of a WCF service using a behaviour for JSON serialization. Notice the different levels of behaviours, from endpoint behaviours to service behaviours; the JSON support is enabled in an endpoint behaviour. Also notice the bindings: we have different types for different clients, so .NET clients can use the webHttpBinding and Java clients can use the basicHttpBinding.

<system.serviceModel>
    <behaviors>
      <serviceBehaviors>
        <behavior name="BaseBehaviors">
          <serviceDebug includeExceptionDetailInFaults="true" />
          <serviceMetadata httpGetEnabled="True" httpsGetEnabled="True" httpGetUrl="Products/GetList" httpsGetUrl="Products/GetList" />
        </behavior>
      </serviceBehaviors>
      <endpointBehaviors>
        <behavior name="BaseHttpEndpointBehavior">
        </behavior>
        <behavior name="jsonBehavior">
          <enableWebScript  />
        </behavior>
      </endpointBehaviors>
    </behaviors>

    <serviceHostingEnvironment aspNetCompatibilityEnabled="false" />

    <services>
      <service behaviorConfiguration="BaseBehaviors" name="Romiko.MyService">
        <endpoint name="ProductService" address="Products" behaviorConfiguration="BaseHttpEndpointBehavior"
          binding="basicHttpBinding" bindingConfiguration=""
          contract="Romiko.IProductsService" />
        <endpoint name="ProductServiceSSL" address="ProductsSSL" behaviorConfiguration="BaseHttpEndpointBehavior"
          binding="basicHttpBinding" bindingConfiguration="SecureSSL"
          contract="Romiko.IProductsService">
        </endpoint>
        <endpoint name="ProductsServiceJSON" address="ProductsJSON" behaviorConfiguration="jsonBehavior"
                  binding="webHttpBinding" bindingConfiguration=""
                  contract="Romiko.IProductsService" />
        <endpoint name="ProductsServiceJSONSSL" address="ProductsJSONSSL" behaviorConfiguration="jsonBehavior"
                  binding="webHttpBinding" bindingConfiguration="SecureSSLWeb"
                  contract="Romiko.IProductsService">
        </endpoint>
        <host>
          <baseAddresses>
            <add baseAddress="http://localhost/Products" />
            <add baseAddress="https://localhost:443/Products" />
          </baseAddresses>
        </host>
      </service>
    </services>
    
    
    <bindings>
      <basicHttpBinding>
        <binding name="SecureSSL">
          <security mode="Transport">
            <transport clientCredentialType="None"/>
          </security>
        </binding>
      </basicHttpBinding>
      <webHttpBinding>
        <binding name="SecureSSLWeb">
          <security mode="Transport">
            <transport clientCredentialType="None"/>
          </security>
        </binding>
      </webHttpBinding>
    </bindings>
  </system.serviceModel>
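The configuration above references `Romiko.IProductsService`, which is not shown in the post, so here is an assumed minimal shape of that contract (the operation name is inferred from the `httpGetUrl="Products/GetList"` metadata setting):

```csharp
using System.Collections.Generic;
using System.ServiceModel;

namespace Romiko
{
    // Assumed sketch of the contract shared by all four endpoints above.
    // The same contract serves the SOAP (basicHttpBinding) and JSON
    // (webHttpBinding + enableWebScript) endpoints.
    [ServiceContract]
    public interface IProductsService
    {
        [OperationContract]
        List<string> GetList();
    }
}
```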

 

Well, I hope this helps you get your toes a little deeper into WCF, so the next time you write a WCF service you can nut out all the architectural principles BEFORE writing the code.

References:

http://msdn.microsoft.com/en-us/library/ff183867.aspx

http://msdn.microsoft.com/en-us/library/ms733128.aspx

PayPal Payment Standard IPN/PDT–Asynchronous Processing

Hi Guys,

I am currently working on a personal project that interfaces with PayPal Payment Standard in an MVC3 and Windows Azure based application. I needed a clean way to convert request/response objects from PayPal into their corresponding IPN/PDT objects.

I want a payment solution that can easily scale under high load, so we will leverage NServiceBus as the front end –> back end service bus infrastructure. This decouples payments from the functionality of the site, so under high load payment processing will not restrict the usability of the site. This is done with MSMQ and NServiceBus; I will show a basic example of how.

Below is a DTO you can use for IPN or PDT data. Of course, you will need logic to convert HTTP request/response data into this DTO, and there are many ways of doing so.

More information about PDT/IPN variables can be found here:

https://www.paypal.com/IntegrationCenter/ic_ipn-pdt-variable-reference.html
 https://cms.paypal.com/us/cgi-bin/?cmd=_render-content&content_ID=developer/e_howto_html_IPNandPDTVariables#id091EB0901HT

You can then take an IPN and convert the request form/query-string values to the DTO, e.g.

  _myIpnNotification.Invoice = _myRequest["invoice"];

The same applies for the PDT which you will receive out of band on a separate listener.

 

Here are the different transaction types and statuses:

[Serializable]
public enum PdtStatus
{
    Unknown = 0,

    [Description("SUCCESS")] Success,
    [Description("FAIL")] Fail
}

[Serializable]
public enum IpnStatus
{
    Unknown,
    [Description("verified")] Verified,
    [Description("invalid")] Invalid
}

[Serializable]
public enum TransactionType
{
    Unknown = 0,

    [Description("cart")] Cart,

    [Description("express_checkout")] ExpressCheckout,

    [Description("merch_pmt")] MerchantPayment,

    [Description("send_money")] SendMoney,

    [Description("virtual_terminal")] VirtualTerminal,

    [Description("web_accept")] WebAccept,

    [Description("masspay")] MassPayment,

    [Description("subscr_signup")] SubscriptionSignUp,

    [Description("subscr_cancel")] SubscriptionCancellation,

    [Description("subscr_failed")] SubscriptionPaymentFailed,

    [Description("subscr_payment")] SubscriptionPayment,

    [Description("subscr_eot")] SubscriptionEndOfTime,

    [Description("subscr_modify")] SubscriptionModification
}
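To map the raw `txn_type` or status string from PayPal onto these enums, one approach is to scan the `Description` attributes via reflection. This helper is my own sketch (the class and method names are not from the original code):

```csharp
using System.ComponentModel;
using System.Linq;
using System.Reflection;

// Hypothetical helper: map a raw PayPal string such as "subscr_payment" onto
// the enum member carrying the matching [Description] attribute, falling back
// to the default member (Unknown = 0 in the enums above).
public static class EnumDescription
{
    public static T Parse<T>(string raw) where T : struct
    {
        foreach (FieldInfo field in typeof(T).GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            var attr = field.GetCustomAttributes(typeof(DescriptionAttribute), false)
                            .Cast<DescriptionAttribute>()
                            .FirstOrDefault();
            if (attr != null && attr.Description == raw)
                return (T)field.GetValue(null);
        }
        return default(T); // Unknown
    }
}
```

With the enums above, `EnumDescription.Parse<TransactionType>("subscr_payment")` would yield `TransactionType.SubscriptionPayment`.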

Security: if you want the best security, never trust anything that claims to come from PayPal; always double-check the data. So when an IPN comes in, take that data, do a verification check with PayPal, and confirm the results match, and do the same with PDT data.

This means checking that the IPN data (currency, amount, date, subscription type, etc.) is the same as what is in your transaction log; this way, any spoofing is protected against. Here is an example of a rule.

public interface IRulesCommon
{
    List<Error> ApplyRulesStandardPayments(decimal resultAmount, Currency resultCurrency, string receiver, TransactionLog transaction);
}

public interface IRulesPaypal
{
    List<Error> ComparePaymentFrequencyTransactionStatus(TransactionLog transaction, TransactionType transactionType);

    // Used by IPN only for sign-up
    List<Error> CheckPaymentFrequencyOnSignupIpn(TransactionLog transaction, TimePeriod subscriptionPeriod);

    List<Error> CheckSubscriptionPayment(decimal mcAmount3, Currency currency, string receiver,
                                         TransactionLog transaction);

    List<Error> CheckSubscriptionSignUpCancel(decimal mcAmount3, string recurring, Currency currency, string receiver,
                                              TransactionLog transaction);
}
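Alongside those rules, the verification round-trip itself is worth sketching: PayPal expects you to echo the raw IPN body back with `cmd=_notify-validate`, and answers with the literal string VERIFIED or INVALID. A hedged sketch (the class name and URL handling are my own; use PayPal's sandbox endpoint when testing):

```csharp
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper: post the raw IPN body back to PayPal for verification.
public static class IpnVerifier
{
    public static bool Verify(string rawIpnBody, string paypalUrl)
    {
        var request = (HttpWebRequest)WebRequest.Create(paypalUrl);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";

        // PayPal requires the original variables prefixed with _notify-validate.
        byte[] payload = Encoding.ASCII.GetBytes("cmd=_notify-validate&" + rawIpnBody);
        request.ContentLength = payload.Length;
        using (Stream stream = request.GetRequestStream())
            stream.Write(payload, 0, payload.Length);

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
            return reader.ReadToEnd() == "VERIFIED";
    }
}
```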

 

Unfortunately, I cannot show the implementation logic for these rules, as publishing it could weaken the security of my site if a weakness were found, but the interface above is a good start.

As you can see above, I do a lot of checks: I ensure the amount, currency and receiver all match the original transaction object.

So, now we need an IPN or PDT listener; say I have an MVC3 listener controller like so:

 public class IpnController : Controller
    {
        public readonly ILog Log = LogManager.GetLogger(MethodBase.GetCurrentMethod().DeclaringType);
        readonly IBus _bus; //NServiceBus Object
        readonly IIpnDataProvider _ipnDataProvider;
        readonly ISettingsDataProvider _settings;

        
        //Constructor Inject with Autofac
        public IpnController(IBus bus, IIpnDataProvider ipnDataProvider, ISettingsDataProvider settings)
        {
            _bus = bus;
            _settings = settings;
            _ipnDataProvider = ipnDataProvider;
        }

        /// <summary>
        /// Expects a post with variables like: ?mc_gross=19.95&protection_eligibility=Eligible&address_status=confirmed&payer_id=LPLWNMTBWMFAY&tax=0.00&address_street=1+Main+St&payment_date=20%3A12%3A59+Jan+13%2C+2009+PST&payment_status=Completed&charset=windows-1252&address_zip=95131&first_name=Test&mc_fee=0.88&address_country_code=US&address_name=Test+User&notify_version=2.6&custom=1||1||myredirecturl&payer_status=verified&address_country=United+States&address_city=San+Jose&quantity=1&verify_sign=AtkOfCXbDm2hu0ZELryHFjY-Vb7PAUvS6nMXgysbElEn9v-1XcmSoGtf&payer_email=gpmac_1231902590_per%40paypal.com&txn_id=61E67681CH3238416&payment_type=instant&last_name=User&address_state=CA&receiver_email=gpmac_1231902686_biz%40paypal.com&payment_fee=0.88&receiver_id=S8XGHLYDW9T3S&txn_type=express_checkout&item_name=&mc_currency=USD&item_number=&residence_country=US&test_ipn=1&handling_amount=0.00&transaction_subject=&payment_gross=19.95&shipping=0.0
        /// </summary>
        /// <returns></returns>
        public ActionResult Ipn()
        {
            string rawData = Extract.GetHttpRawData(Request);

            if (_settings.AuditHttpEnabled)
                Audit.AuditHttp(rawData, _bus, Request.RawUrl); //Asynchronous auditing of the IPN message for audit tracking

            if (_ipnDataProvider != null && _ipnDataProvider.MandatoryDataSpecified)
            {
                try
                {
                    var message = new PaypalIpnMessage
                        {
                            OriginalHttpRequest = rawData,
                            MessageId = Guid.NewGuid(),
                            InvoiceId = _ipnDataProvider.TransactionId,
                            Notification = new ResponseToIpnNotification(Request.Form).GetIpnNotification()
                        };
                    _bus.Send(message); //Asynchronous processing of the payment message to the backend systems
                }
                catch (Exception e)
                {
                    Log.Info(e);
                    Log.Info(rawData);
                    return new HttpStatusCodeResult(400);
                }
                return View();
            }

            Log.Info("Request data does not contain mandatory fields e.g. MerchantId and MerchantTransactionId");
            Log.Info(rawData);
            return new HttpStatusCodeResult(400);
        }
    }

 

As you can see above, payment processing is now totally decoupled: the front end and back end work independently of each other with no synchronous calls, and yes, this is all wrapped within an MSDTC transaction. I prefer this to using WCF, since there is no issue in dealing with latency and I have guaranteed delivery of my crucial message. You will write a similar controller for the PDT out-of-band response as well. We also leverage an IoC container to automatically inject the service bus object into the controller along with the other dependencies; my favourite is Autofac.
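For completeness, the back-end side of that bus would be an NServiceBus message handler along these lines (a sketch only; the body is a placeholder for the rule checks discussed above):

```csharp
using NServiceBus;

// Back-end handler: NServiceBus pops PaypalIpnMessage off the MSMQ queue and
// invokes Handle. The call runs inside the transport transaction, so a thrown
// exception rolls the message back onto the queue for retry.
public class PaypalIpnHandler : IHandleMessages<PaypalIpnMessage>
{
    public void Handle(PaypalIpnMessage message)
    {
        // 1. Re-verify the IPN with PayPal (never trust the inbound post).
        // 2. Run the amount/currency/receiver rules against the transaction log.
        // 3. Mark the invoice as paid only when every rule passes.
    }
}
```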

 public ActionResult Process()
        {
            string rawData = Extract.GetHttpRawData(Request);
            if (_settings.AuditHttpEnabled)
                Audit.AuditHttp(rawData, _bus, Request.RawUrl);

            if (_successDataProvider != null && _successDataProvider.MandatoryDataSpecified)
            {
                try
                {
                    var message = new PaypalPdtMessage
                        {
                            OriginalHttpRequest = rawData,
                            MessageId = Guid.NewGuid(),
                            TransactionIdForPdtToken = _successDataProvider.TransactionIdForPdtToken,
                            AmountPaidRecordedByPdt = _successDataProvider.AmountPaidRecordedByPdt,
                            InvoiceId = _successDataProvider.TransactionId
                        };

                    _bus.Send(message);
                   
                }
                catch (Exception e)
                {
                    Log.Info(e);
                    Log.Info(rawData);
                }

                return Redirect(_successDataProvider.MerchantRedirectURL);
            }


            Log.Info("Request data does not contain mandatory fields e.g. MerchantId");
            Log.Info(rawData);
            return new HttpStatusCodeResult(400);
        }
    }

I hope this gives you some ideas for developing robust payment options for your site while providing a nice user experience.

Remember to also deal with your dates in UTC format, and ensure the date's Kind property is set to UTC as well for extra safety when storing the transaction date.
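A minimal illustration of that advice:

```csharp
using System;

class TransactionTimestamps
{
    static void Main()
    {
        // Record in UTC and keep the Kind explicit, so a round-trip through
        // the database cannot silently reinterpret the value as local time.
        DateTime recordedAt = DateTime.UtcNow;
        Console.WriteLine(recordedAt.Kind); // Utc

        // Values read back from storage often come out with Kind == Unspecified;
        // re-stamp them before doing any comparisons.
        DateTime fromDb = DateTime.SpecifyKind(recordedAt, DateTimeKind.Utc);
        Console.WriteLine(fromDb.Kind); // Utc
    }
}
```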

So in a nutshell, you can have top-notch validation that protects against spoofing attacks if you ALWAYS take an IPN or PDT and compare it with the original transaction object! If you keep your transaction ids unpredictable, users can never guess someone else's transaction id and hijack it for payment. So you can have a nice invoice number, e.g. 90124, but the transaction id should not be easy to predict; it could be a Guid, etc.

 

Imagine if the above were not the case: a smart user could create two transactions, one for a cheap item and one for an expensive one, then hijack the IPN/PDT (or send a FAKE PDT/IPN to your system), swap the item information and transaction id around, and later cancel the expensive transaction. Hence why I say: check AMOUNT, CURRENCY, items, transaction id, etc. There are a lot of sites out there that are easy to hack because they do not do this sort of double check on IPN and PDT messages from PayPal. If I know your IPN or PDT listener, I can send fake messages in and probe for weaknesses, for example by generating two transactions on the site and trying to swap items, currencies, etc. In fact, because an attacker cannot guess my transaction number, I do not need to check the items in the list, as they are not processed in my listener; if they changed a flower pot to a BMW, who cares, I ignore that sort of data in an IPN/PDT. As long as the fundamentals match, we are in good shape.

Matthew Wills and I spent a lot of time thinking these scenarios through and nutted out a nice secure solution. The above samples should get you started in the right direction for PayPal Standard integration. A lot of implementation logic has been left out, and this is done on purpose.

Cheers

Linq to SQL Anti-Patterns–Dealing with nullable types

Hi Guys,

This blog will demonstrate some bad habits in LinqToSql and how to deal with nullable types that provide clean code without null checks all over and improved performance on the projections.

In one of my posts I recommended and demonstrated the value of using NHProf; the same goes for Linq to SQL. You can download a trial here:

http://l2sprof.com/

It is really easy to use. In the case of ASP.NET, just add a global.asax (or edit an existing one's code-behind) like so:

public class Global : System.Web.HttpApplication
    {

        protected void Application_Start(object sender, EventArgs e)
        {
            try
            {

                var profileOn = false;
                bool.TryParse(ConfigurationManager.AppSettings["EnableLinqToSQLProfile"], out profileOn);
                if (profileOn)
                    HibernatingRhinos.Profiler.Appender.LinqToSql.LinqToSqlProfiler.Initialize();
            }
            catch (Exception)
            {
               Debug.WriteLine("Could not initialize LinqToSql Profiling");
            }
        }
    }

Excellent! Then add a reference to HibernatingRhinos.Profiler.Appender.dll, which you can keep in your lib folder.

Right, now let's have a look at some customer code that causes 2 hits to the database, which we can reduce to one.

Old Code causing 2 hits:

 public int GetCaseStudyIDByPageID(int pageID)
        {
            var result = _db.CaseStudies.Where(i => i.PageID == pageID);

            if (result.Any())
            {
                return result.First().CaseStudyId;
            }

            return -1;
        }

So from above, if we attach the L2SQL profiler, we will see two identical queries going to the DB: one for the .Any() and another for the .First().

[profiler screenshot]

Useful query (note the extra columns we do not need; we will come back to this, as the projection here pulls far too many columns when I just need the CaseStudyId!):

SELECT TOP ( 1 ) [t0].[CaseStudyId],
                 [t0].[PageID],
                 [t0].[Title],
                 [t0].[ShortDescription],
                 [t0].[LongDescription],
                 [t0].[Challenge],
                 [t0].[Solution],
                 [t0].[Results],
                 [t0].[ImageURL],
                 [t0].[Rank],
                 [t0].[Visible],
                 [t0].[ModifiedById],
                 [t0].[DateCreated],
                 [t0].[DateModified]
FROM   [dbo].[CaseStudy] AS [t0]
WHERE  [t0].[PageID] = 128 /* @p0 */

So the extra query is:

SELECT (CASE 
          WHEN EXISTS (SELECT NULL AS [EMPTY]
                       FROM   [dbo].[CaseStudy] AS [t0]
                       WHERE  [t0].[PageID] = 128 /* @p0 */) THEN 1
          ELSE 0
        END) AS [value]

Let's improve it:

        public int GetCaseStudyIDByPageID(int pageID)
        {
            var result = _db.CaseStudies.FirstOrDefault(i => i.PageID == pageID);
            return result == null ? -1 : result.CaseStudyId;
        }

Now in the profiler, we will see only 1 statement being executed, as we removed the Any() extension method.

[profiler screenshot]

The same goes for counts

Old code – 2 queries to the DB:

public int GetCaseStudyIDByPageID(int pageID)
        {
            var r = from I in DB.CaseStudies
                    where I.PageID == pageID
                    select I.CaseStudyId;

            return r.Count() == 0 ? -1 : r.First();
        }

 

New optimised code:

      public int GetCaseStudyIDByPageID(int pageID)
        {
            var result = _db.CaseStudies.FirstOrDefault(i => i.PageID == pageID);
            return result == null ? -1 : result.CaseStudyId;
        }

 

Let's improve it further by using projections to reduce the number of columns coming back; if you look at the result, we currently get all the columns back, which is extra data over the wire:

        public int GetCaseStudyIDByPageID(int pageID)
        {
            var result = _db.CaseStudies.Where(z => z.PageID == pageID).Select(z => (int?)z.CaseStudyId).FirstOrDefault();
            return result ?? -1;

        }
SELECT TOP ( 1 ) [t1].[value]
FROM   (SELECT [t0].[CaseStudyId] AS [value],
               [t0].[PageID]
        FROM   [dbo].[CaseStudy] AS [t0]) AS [t1]
WHERE  [t1].[PageID] = 128 /* @p0 */

This returns far fewer columns.

[profiler screenshot]

However, we can improve this further by creating a method to handle this for us; how about something along the lines of NullableFirstOrDefault?

        public int GetCaseStudyIDByPageID(int pageID)
        {
            var result = NullableFirstOrDefault(_db.CaseStudies.Where(z => z.PageID == pageID).Select(z => z.CaseStudyId));
            return result ?? -1;

        }


        public Nullable<T> NullableFirstOrDefault<T>(IQueryable<T> input) where T : struct
        {
            return input.Select(z => (Nullable<T>)z).FirstOrDefault();
        }

Now we get the same result with a limited projection over the wire, and we can turn this into an extension method to optimise all nullable first-or-defaults.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ROMIKO.NET.LINQ.Extensions
{
    public static class LinqToSqlExtension
    {


        public static Nullable<T> NullableFirstOrDefault<T>(this IQueryable<T> input) where T : struct
        {
            return input.Select(z => (Nullable<T>)z).FirstOrDefault();
        }

        public static IQueryable<Nullable<T>> SelectNullable<T>(this IQueryable<T> input) where T : struct
        {
            return input.Select(z => (Nullable<T>)z);
        }
    }
}

And now we can use it like this, without any null checks:

        public int GetCaseStudyIDByPageID(int pageID)
        {
            var result = _db.CaseStudies.Where(z => z.PageID == pageID).Select(z => z.CaseStudyId).NullableFirstOrDefault();
            return result ?? -1;

        }

 

The result is the same, but the code is easier to read and the queries are faster.

Let's take it a step further and make the code even easier to read by introducing an extension method that allows you to provide default values for nullable types (Matthew, you're a geek!):

        public static T FirstOrDefault<T>(this IQueryable<T> input, T defaultValue) where T : struct
        {
            return input.Select(z => (Nullable<T>)z).FirstOrDefault() ?? defaultValue;
        }

So our LINQ extension class looks like this

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace MATTHEWWILLS.NET.LINQ.Extensions
{
    public static class LinqToSqlExtension
    {


        public static Nullable<T> NullableFirstOrDefault<T>(this IQueryable<T> input) where T : struct
        {
            return input.Select(z => (Nullable<T>)z).FirstOrDefault();
        }

        public static T FirstOrDefault<T>(this IQueryable<T> input, T defaultValue) where T : struct
        {
            return input.Select(z => (Nullable<T>)z).FirstOrDefault() ?? defaultValue;
        }


        public static IQueryable<Nullable<T>> SelectNullable<T>(this IQueryable<T> input) where T : struct
        {
            return input.Select(z => (Nullable<T>)z);
        }
    }
}

Nice! Now look how easy the code is to read, and it is optimised on projections.

 

      public int GetCaseStudyIDByPageID(int pageID)
        {
            return _db.CaseStudies.Where(z => z.PageID == pageID).Select(z => z.CaseStudyId).FirstOrDefault(-1);
        }

The above produces the same result in the profiler:

SELECT TOP ( 1 ) [t1].[value]
FROM   (SELECT [t0].[CaseStudyId] AS [value],
               [t0].[PageID]
        FROM   [dbo].[CaseStudy] AS [t0]) AS [t1]
WHERE  [t1].[PageID] = 128 /* @p0 */

This anti-pattern was used a lot throughout their code and caused double/triple calls to the DB on every page load. Using a profiling tool like NHProf, L2SProf or EFProf will save you and your customer/employer a lot of money in the long term, and will perhaps save developers from picking up bad habits where IQueryable is abused and treated like a list when in fact it executes on the back end.

We have solved this, and also provided a neat way of dealing with nullable types via clean extension methods.

So we have solved the scalar issues with FirstOrDefault. Code like the snippet below looks easy to write to the untrained eye, but it performs an invalid check on a nullable type; with the custom extension methods provided above, you never need to think about it:

var result = _db.CaseStudies.FirstOrDefault(i => i.PageID == pageID).Select(x=>x.PageId);

return result ?? -1; //Invalid Check

Also, with the repository pattern being used here, people ended up with code showing these stats:

[profiler statistics screenshot]

Above, we have roughly 700 SQL statements being called, each on average using its own data context. Try to keep the number of data contexts to a minimum: one data context can serve all sorts of requests, so when choosing a repository pattern or strategy, profile the number of data contexts created and try to reduce them.

[profiler screenshot]

Above is for one page load; not a good design pattern, so there is room to improve with some dependency injection and a singleton on the data context.

You can read other tips on profiling here:

https://romikoderbynew.wordpress.com/tag/nhprof/

Thanks to Matthew Wills again for some awesome tips on profiling.

Cheers