#Neo4j Gremlin queries with CopySplit/table leveraging Neo4jClient

Hi,

I would like to share how to write Gremlin queries using the .NET Neo4jClient.

Consider the following graph:

image

The objective is to produce a table of results that shows:

  • ReferralDate (ReferralDecisionSection Node)
  • ReferralId (Referral Node)
  • FamilyName (User Node)
  • GivenName (User Node)

The trick is that we want all referrals, including those that do not have a who section, in which case the ReferralDate will be NULL. We also want referrals that are indirectly linked to a program (via a decision) as well as those that are not.

ReferralDate  ReferralId  FamilyName  GivenName
13 Jan 2012   1           Derbynew    Romiko
NULL          2           Derbynew    Romiko

So what we are doing is essentially performing left/right joins on ReferralNode, ReferralDecisionNode and Program.
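
To make the join semantics concrete before looking at the graph query, here is a minimal in-memory sketch using plain LINQ rather than Neo4jClient. The record types (Referral, WhoSection, ReferralRow) and the BuildRows helper are hypothetical and exist only for this illustration; the sketch assumes C# 9 or later. It shows how a left join keeps a referral even when it has no who section, leaving ReferralDate as NULL.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical, flattened stand-ins for the graph nodes (not Neo4jClient types).
public record Referral(int UniqueId, string FamilyName, string GivenName);
public record WhoSection(int ReferralId, DateTime ReferralDate);
public record ReferralRow(DateTime? ReferralDate, int ReferralId, string FamilyName, string GivenName);

public static class LeftJoinSketch
{
    // Left join: every referral produces a row; ReferralDate is null
    // when the referral has no matching who section.
    public static List<ReferralRow> BuildRows(
        IEnumerable<Referral> referrals, IEnumerable<WhoSection> whoSections)
    {
        var query =
            from r in referrals
            join w in whoSections on r.UniqueId equals w.ReferralId into matches
            from who in matches.DefaultIfEmpty()
            select new ReferralRow(who?.ReferralDate, r.UniqueId, r.FamilyName, r.GivenName);
        return query.ToList();
    }

    public static void Main()
    {
        var referrals = new[]
        {
            new Referral(1, "Derbynew", "Romiko"),
            new Referral(2, "Derbynew", "Romiko")
        };
        // Only referral 1 has a who section.
        var whoSections = new[] { new WhoSection(1, new DateTime(2012, 1, 13)) };

        foreach (var row in BuildRows(referrals, whoSections))
        {
            var date = row.ReferralDate.HasValue
                ? row.ReferralDate.Value.ToString("dd MMM yyyy", CultureInfo.InvariantCulture)
                : "NULL";
            Console.WriteLine(date + " " + row.ReferralId + " " + row.FamilyName + " " + row.GivenName);
        }
    }
}
```

Running Main prints the same two rows as the table above: one with a date and one with NULL.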

Let's see how we can do this with the .NET Neo4jClient:

    return graphClient
        .RootNode
        .CopySplitV<Referral>(
            // Branch 1: incomplete referrals whose decisions section suggests the "Foundation" program
            new IdentityPipe()
                .Out(Hosts.TypeKey, a => a.Key == userIdentifier.AgencyKey)
                .In(UserBelongsTo.TypeKey, u => u.Username == userIdentifier.Username)
                .Out(UserLinkedToProgram.TypeKey, p => p.Name == "Foundation")
                .In(HasSuggestedProgram.TypeKey)
                .In(ReferralHasDecisionsSection.TypeKey, r => r.Completed == false)
                .AggregateV("ReferralWithProgramFoundation"),
            // Branch 2: all incomplete referrals that belong to the agency
            new IdentityPipe()
                .Out(Hosts.TypeKey, a => a.Key == userIdentifier.AgencyKey)
                .In(ReferralBelongsTo.TypeKey, r => r.Completed == false)
        )
        .FairMerge()
        .ExceptV("ReferralWithProgramFoundation")   // drop the aggregated "Foundation" referrals
        .GremlinDistinct()
        // Mark the steps that feed the table columns
        .Out(ReferralHasWhoSection.TypeKey)
        .As("ReferralDate")
        .In(ReferralHasWhoSection.TypeKey)
        .As("ReferralId")
        .Out(CreatedBy.TypeKey, u => u.Username == userIdentifier.Username)
        .As("UserGivenName")
        .As("UserFamilyName")
        .Table(
            who => who.ReferralDate,
            referral => referral.UniqueId,
            user => user.FamilyName,
            user => user.GivenName
        );

Notice the following:

  • CopySplit uses the concept of an identity pipe as a continuation of the previous output
  • CopySplit copies each incoming object to both branch pipes, so the two queries conceptually run in parallel
  • One branch gets all the referrals in the system; the other gets the referrals that have the program “Foundation”
  • We then store the referrals that have a program (“Foundation”) in an aggregate (variable)
  • We merge the results of the two branches together with a FairMerge
  • We exclude referrals that are in a program called “Foundation” with an Except
  • We then deduplicate results with GremlinDistinct
  • We then use As to mark the steps we need for table projections
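
The steps above can be sketched in memory with plain LINQ. This is an illustration of the set logic only, not of how Gremlin or Neo4jClient execute the query: the ReferralsOutsideProgram helper and the integer referral ids are hypothetical, and Concat stands in for FairMerge (which interleaves branch results rather than appending them).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CopySplitSketch
{
    // Hypothetical helper: returns the referral ids that survive the
    // merge, except and dedupe steps.
    public static List<int> ReferralsOutsideProgram(
        IEnumerable<int> referralsInProgram, IEnumerable<int> allReferrals)
    {
        // Branch 1 side effect: remember the program's referrals (the "aggregate").
        var aggregate = new HashSet<int>(referralsInProgram);

        // Merge the output of both branches into a single stream.
        // Assumption: ordering does not matter here, so Concat replaces FairMerge.
        var merged = referralsInProgram.Concat(allReferrals);

        return merged
            .Where(id => !aggregate.Contains(id)) // Except: drop aggregated referrals
            .Distinct()                           // GremlinDistinct: remove duplicates
            .ToList();
    }

    public static void Main()
    {
        var inFoundation = new[] { 3 };  // referrals linked to "Foundation"
        var all = new[] { 1, 2, 3 };     // every referral of the agency

        var result = ReferralsOutsideProgram(inFoundation, all);
        Console.WriteLine(string.Join(", ", result)); // prints "1, 2"
    }
}
```

Referral 3 reaches the merged stream from both branches, but because it was recorded in the aggregate it is filtered out, leaving only referrals 1 and 2.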

Note: Using the As clause within a CopySplit pipe in conjunction with table projections will produce undesired results. I am not sure whether Gremlin supports such operations; if you know, please contact me.

Visit Marko Rodriguez for in-depth discussions on Gremlin.

7 thoughts on “#Neo4j Gremlin queries with CopySplit/table leveraging Neo4jClient”

  1. Pingback: Gremlin vs Cypher Initial Thoughts @Neo4j « Another Word For It

  2. Pingback: Gremlin vs Cypher Initial Thoughts @Neo4j « Romiko Derbynew's Blog

  3. Hi. I’m creating an application where I have to connect different objects, like activity to person, activity to company, room to building, etc. As I see it, Neo4j seems to be a good DB for this. How is the data stored in the DB? Is a node (i.e. person) created only once and linked n times to other objects, or is the data stored like in a document DB?
    As I see it, it’s much easier to search in a graph DB. For example, getting all relations (in/out) of a specific room is much easier to query than in an RDBMS. Am I right?
    Where can I find examples/tutorials and documentation about Neo4jClient? I need them VERY quickly 🙂

  4. Me again… I want to search for nodes of a specific type (person). For this, I want to create a “person” index. Is it possible to execute a “LIKE/CONTAINS” query on an index?
    For example: … WHERE FirstName LIKE ‘%123%’
    What is the name of this function?
    Thanks!

  5. Yes, you can: use the built-in Lucene full-text search =D

    public ClientSearchResultSet SearchClients(string agencyKey, ClientSearchCriteria clientSearchCriteria, int pageNumber)
    {
        const int pageSize = ConfigConstants.ClientSearchCountPerPage;

        var utcNow = timeService.DateTimeNowUtc();
        var fullclientNodes = new Node[] {};

        if (clientSearchCriteria.DoesAtLeastOneFieldHaveAValue)
        {
            IEnumerable<Node> clientNodes;
            var searchQuery = BuildSearchQueryFromCriteria(clientSearchCriteria, utcNow);
            if (string.IsNullOrWhiteSpace(searchQuery))
            {
                clientNodes = Enumerable.Empty<Node>().ToArray();
            }
            else
            {
                var agency = agencyService.GetAgencyByKey(agencyKey);
                var indexName = IndexNames.Clients(agency.Data);

                fullclientNodes = graphClient
                    .QueryIndex(indexName, IndexFor.Node, searchQuery)
                    .ToArray();

                clientNodes = fullclientNodes.Take(ConfigConstants.ClientSearchCountPerPage * ConfigConstants.ClientSearchMaxPages);
            }

            var resultsNotSorted = clientNodes
                .Select(c => BuildClientSearchResult(c, utcNow));

            var list = resultsNotSorted.ToList();
            if (!string.IsNullOrWhiteSpace(clientSearchCriteria.FamilyName) ||
                !string.IsNullOrWhiteSpace(clientSearchCriteria.GivenName))
                list.Sort(new ClientNameComparator(clientSearchCriteria));

            if (clientSearchCriteria.DateOfBirth != null || clientSearchCriteria.Age != null || clientSearchCriteria.Gender != null)
                list.Sort(new ClientDobAgeGenderComparator(clientSearchCriteria));

            var results = list.Skip((pageNumber - 1) * pageSize)
                .Take(pageSize)
                .ToList();

            return new ClientSearchResultSet
            {
                Results = results,
                TotalResultCount = fullclientNodes.Count(),
                SearchQuery = searchQuery
            };
        }

        var clientsQuery = graphClient
            .RootNode
            .Out(Hosts.TypeKey, a => a.Key == agencyKey)
            .In(ClientBelongsTo.TypeKey)
            .GremlinSkip(0)
            .GremlinTake(ConfigConstants.ClientSearchCountPerPage * ConfigConstants.ClientSearchMaxPages);

        var totalClients = graphClient
            .RootNode
            .Out(Hosts.TypeKey, a => a.Key == agencyKey)
            .In(ClientBelongsTo.TypeKey)
            .GremlinCount();

        var fullresultsnotsorted = clientsQuery
            .Select(c => BuildClientSearchResult(c, utcNow))
            .ToList();

        fullresultsnotsorted.Sort(new ClientIdComparator());

        var fullresults = fullresultsnotsorted.Skip((pageNumber - 1) * pageSize)
            .Take(pageSize)
            .ToList();

        return new ClientSearchResultSet
        {
            Results = fullresults,
            TotalResultCount = totalClients,
            SearchQuery = clientsQuery.QueryText
        };
    }

    public NodeReference CreateClient(Client client, Node agencyNode, IEnumerable indexEntries)
    {
        var uniqueIdGenerator = uniqueIdGeneratorCallback();
        client.UniqueId = uniqueIdGenerator.NextId(UniqueIdScopes.Clients(agencyNode.Data));

        var clientNodeReference = graphClient.Create(
            client,
            new[] { new ClientBelongsTo(agencyNode.Reference) },
            indexEntries);
        return clientNodeReference;
    }
