Hi Guys,
I spent some time working on full text search for Neo4j. The basic goals were as follows.
- Control the pointers of the index
- Full Text Search
- All operations are done via Rest
- Can create an index when creating a node
- Can update an index
- Can check if an index exists
- When bootstrapping Neo4j in the cloud, run index checks
- Query an index using the Lucene full text query language
Download:
This is based on Neo4jClient:
Source Code at:
Introduction
So with the above objectives, I decided to go with Manual Indexing. The main reason here is that I can put an index pointing to node A based on values in node B.
Imagine the following.
You have Node A with a list of names: Surname, FirstName and MiddleName. However, Node A also has a relationship to Node B, which holds other names, perhaps Display Names, Avatar Names and AKAs.
With manual indexing, all of the name entries from both Node A and Node B can point to Node A only.
So, in a REST call to the Neo4j server, it would look something like this in Fiddler.

Notice the following:
Url: http://localhost:7474/db/data/index/node/{IndexName}/{Key}/{Value}
So, if we were adding 3 names for the SAME client from 2 different nodes, we would use the same IndexName and Key each time, with a different Value in the Url. The node pointer (in the request body) is then the address of the node.
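To make the Url pattern above concrete, here is a minimal sketch of how the entry addresses could be built for several names. The class and method names (IndexUrlSketch, BuildEntryUrl, BuildEntryUrls) are made up for illustration and are not part of Neo4jClient; the sketch also assumes each Url segment should be escaped, which the client code later in this post does not do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only (not part of Neo4jClient).
public static class IndexUrlSketch
{
    // Builds {root}/index/node/{IndexName}/{Key}/{Value}.
    // Assumption: each segment is escaped so values with spaces stay in one segment.
    public static string BuildEntryUrl(string root, string indexName, string key, string value)
    {
        return string.Join("/", new[]
        {
            root.TrimEnd('/'),
            "index",
            "node",
            Uri.EscapeDataString(indexName),
            Uri.EscapeDataString(key),
            Uri.EscapeDataString(value)
        });
    }

    // One Url per value: the index name and key stay the same, only the value changes.
    public static List<string> BuildEntryUrls(string root, string indexName, string key, IEnumerable<string> values)
    {
        return values.Select(v => BuildEntryUrl(root, indexName, key, v)).ToList();
    }
}
```

Calling BuildEntryUrls with three names gives three addresses that differ only in the last segment. Each one would be POSTed with the same node address in the request body.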
Neo4jClient Nuget Package
I have updated the Neo4jClient which is on Nuget, to now support:
- Creating Exact or FullText indexes on their own, so that the index simply exists
- Creating Exact or FullText index entries when creating a node; the node reference is calculated automatically.
- Updating an Index
- Deleting entries from an index.
Class diagram for the indexing solution in Neo4jClient.

RestSharp
The Neo4jClient package uses RestSharp, which makes all the index call operations a trivial task for us. Let's have a look at some of the code inside the client to see how to consume the manual index API from .NET, and then in the next section we'll look at how to consume this code from another application.
public Dictionary<string, IndexMetaData> GetIndexes(IndexFor indexFor)
{
CheckRoot();
string indexResource;
switch (indexFor)
{
case IndexFor.Node:
indexResource = RootApiResponse.NodeIndex;
break;
case IndexFor.Relationship:
indexResource = RootApiResponse.RelationshipIndex;
break;
default:
throw new NotSupportedException(string.Format("GetIndexes does not support indexfor {0}", indexFor));
}
var request = new RestRequest(indexResource, Method.GET)
{
RequestFormat = DataFormat.Json,
JsonSerializer = new CustomJsonSerializer { NullHandling = JsonSerializerNullValueHandling }
};
var response = client.Execute<Dictionary<string, IndexMetaData>>(request);
if (response.StatusCode != HttpStatusCode.OK)
throw new NotSupportedException(string.Format(
"Received an unexpected HTTP status when executing the request.\r\n\r\n\r\nThe response status was: {0} {1}",
(int)response.StatusCode,
response.StatusDescription));
return response.Data;
}
public bool CheckIndexExists(string indexName, IndexFor indexFor)
{
CheckRoot();
string indexResource;
switch (indexFor)
{
case IndexFor.Node:
indexResource = RootApiResponse.NodeIndex;
break;
case IndexFor.Relationship:
indexResource = RootApiResponse.RelationshipIndex;
break;
default:
throw new NotSupportedException(string.Format("IndexExists does not support indexfor {0}", indexFor));
}
var request = new RestRequest(string.Format("{0}/{1}",indexResource, indexName), Method.GET)
{
RequestFormat = DataFormat.Json,
JsonSerializer = new CustomJsonSerializer { NullHandling = JsonSerializerNullValueHandling }
};
var response = client.Execute<Dictionary<string, IndexMetaData>>(request);
return response.StatusCode == HttpStatusCode.OK;
}
void CheckRoot()
{
if (RootApiResponse == null)
throw new InvalidOperationException(
"The graph client is not connected to the server. Call the Connect method first.");
}
public void CreateIndex(string indexName, IndexConfiguration config, IndexFor indexFor)
{
CheckRoot();
string nodeResource;
switch (indexFor)
{
case IndexFor.Node:
nodeResource = RootApiResponse.NodeIndex;
break;
case IndexFor.Relationship:
nodeResource = RootApiResponse.RelationshipIndex;
break;
default:
throw new NotSupportedException(string.Format("CreateIndex does not support indexfor {0}", indexFor));
}
var createIndexApiRequest = new
{
name = indexName.ToLower(),
config
};
var request = new RestRequest(nodeResource, Method.POST)
{
RequestFormat = DataFormat.Json,
JsonSerializer = new CustomJsonSerializer {NullHandling = JsonSerializerNullValueHandling}
};
request.AddBody(createIndexApiRequest);
var response = client.Execute(request);
if (response.StatusCode != HttpStatusCode.Created)
throw new NotSupportedException(string.Format(
"Received an unexpected HTTP status when executing the request..\r\n\r\nThe index name was: {0}\r\n\r\nThe response status was: {1} {2}",
indexName,
(int) response.StatusCode,
response.StatusDescription));
}
public void ReIndex(NodeReference node, IEnumerable<IndexEntry> indexEntries)
{
CheckRoot();
var nodeAddress = string.Join("/", new[] {RootApiResponse.Node, node.Id.ToString()});
var updates = indexEntries
.SelectMany(
i => i.KeyValues,
(i, kv) => new {IndexName = i.Name, kv.Key, kv.Value});
foreach (var update in updates)
{
if (update.Value == null)
continue; // skip null values, but keep indexing the remaining entries
string indexValue;
if(update.Value is DateTimeOffset)
{
indexValue = ((DateTimeOffset) update.Value).UtcTicks.ToString();
}
else if (update.Value is DateTime)
{
indexValue = ((DateTime)update.Value).Ticks.ToString();
}
else
{
indexValue = update.Value.ToString();
}
AddNodeToIndex(update.IndexName, update.Key, indexValue, nodeAddress);
}
}
public void DeleteIndex(string indexName, IndexFor indexFor)
{
CheckRoot();
string indexResource;
switch (indexFor)
{
case IndexFor.Node:
indexResource = RootApiResponse.NodeIndex;
break;
case IndexFor.Relationship:
indexResource = RootApiResponse.RelationshipIndex;
break;
default:
throw new NotSupportedException(string.Format("DeleteIndex does not support indexfor {0}", indexFor));
}
var request = new RestRequest(string.Format("{0}/{1}", indexResource, indexName), Method.DELETE)
{
RequestFormat = DataFormat.Json,
JsonSerializer = new CustomJsonSerializer { NullHandling = JsonSerializerNullValueHandling }
};
var response = client.Execute(request);
if (response.StatusCode != HttpStatusCode.NoContent)
throw new NotSupportedException(string.Format(
"Received an unexpected HTTP status when executing the request.\r\n\r\nThe index name was: {0}\r\n\r\nThe response status was: {1} {2}",
indexName,
(int)response.StatusCode,
response.StatusDescription));
}
void AddNodeToIndex(string indexName, string indexKey, string indexValue, string nodeAddress)
{
var nodeIndexAddress = string.Join("/", new[] { RootApiResponse.NodeIndex, indexName, indexKey, indexValue });
var request = new RestRequest(nodeIndexAddress, Method.POST)
{
RequestFormat = DataFormat.Json,
JsonSerializer = new CustomJsonSerializer { NullHandling = JsonSerializerNullValueHandling }
};
request.AddBody(string.Join("", client.BaseUrl, nodeAddress));
var response = client.Execute(request);
if (response.StatusCode != HttpStatusCode.Created)
throw new NotSupportedException(string.Format(
"Received an unexpected HTTP status when executing the request.\r\n\r\nThe index name was: {0}\r\n\r\nThe response status was: {1} {2}",
indexName,
(int)response.StatusCode,
response.StatusDescription));
}
public IEnumerable<Node<TNode>> QueryIndex<TNode>(string indexName, IndexFor indexFor, string query)
{
CheckRoot();
string indexResource;
switch (indexFor)
{
case IndexFor.Node:
indexResource = RootApiResponse.NodeIndex;
break;
case IndexFor.Relationship:
indexResource = RootApiResponse.RelationshipIndex;
break;
default:
throw new NotSupportedException(string.Format("QueryIndex does not support indexfor {0}", indexFor));
}
var request = new RestRequest(indexResource + "/" + indexName, Method.GET)
{
RequestFormat = DataFormat.Json,
JsonSerializer = new CustomJsonSerializer {NullHandling = JsonSerializerNullValueHandling}
};
request.AddParameter("query", query);
var response = client.Execute<List<NodeApiResponse<TNode>>>(request);
if (response.StatusCode != HttpStatusCode.OK)
throw new NotSupportedException(string.Format(
"Received an unexpected HTTP status when executing the request.\r\n\r\nThe index name was: {0}\r\n\r\nThe response status was: {1} {2}",
indexName,
(int) response.StatusCode,
response.StatusDescription));
return response.Data == null
? Enumerable.Empty<Node<TNode>>()
: response.Data.Select(r => r.ToNode(this));
}
Using the Neo4jClient from within an application
Create an Index and check if it exists
This is useful when bootstrapping Neo4j: you can check whether any indexes that SHOULD be there are missing, create them, and then enumerate all the nodes for that index to add the entries.
public void CreateIndexesForAgencyClients()
{
var agencies = graphClient
.RootNode
.Out<Agency>(Hosts.TypeKey)
.ToList();
foreach (var agency in agencies)
{
var indexName = IndexNames.Clients(agency.Data);
var indexConfiguration = new IndexConfiguration
{
Provider = IndexProvider.lucene,
Type = IndexType.fulltext
};
if (!graphClient.CheckIndexExists(indexName, IndexFor.Node))
{
Trace.TraceInformation("CreateIndexIfNotExists {0} for Agency Key {1}", indexName, agency.Data.Key);
graphClient.CreateIndex(indexName, indexConfiguration, IndexFor.Node);
PopulateAgencyClientIndex(agency.Data);
}
}
}
Create an Index Node Entry when creating a node
var indexEntries = GetIndexEntries(agency.Data, client, clientViewModel.AlsoKnownAses);
var clientNodeReference = graphClient.Create(
client,
new[] {new ClientBelongsTo(agencyNode.Reference)}, indexEntries);
public IEnumerable<IndexEntry> GetIndexEntries(Agency agency, Client client, IEnumerable<AlsoKnownAs> alsoKnownAses)
{
var indexKeyValues = new List<KeyValuePair<string, object>>
{
new KeyValuePair<string, object>(AgencyClientIndexKeys.Gender.ToString(), client.Gender)
};
if (client.DateOfBirth.HasValue)
{
var dateOfBirthUtcTicks = client.DateOfBirth.Value.UtcTicks;
indexKeyValues.Add(new KeyValuePair<string, object>(AgencyClientIndexKeys.DateOfBirth.ToString(), dateOfBirthUtcTicks));
}
var names = new List<string>
{
client.GivenName,
client.FamilyName,
client.PreferredName,
};
if (alsoKnownAses != null)
{
names.AddRange(alsoKnownAses.Where(a => !string.IsNullOrEmpty(a.Name)).Select(aka => aka.Name));
}
indexKeyValues.AddRange(names.Select(name => new KeyValuePair<string, object>(AgencyClientIndexKeys.Name.ToString(), name)));
return new[]
{
new IndexEntry
{
Name = IndexNames.Clients(agency),
KeyValues = indexKeyValues.Where(v => v.Value != null)
}
};
}
Reindex a node
Notice there was a call to PopulateAgencyClientIndex in the code above. This is done in our bootstrap to ensure the indexes are always there as expected. If for some reason they are not, they are created and populated using the reindex feature.
void PopulateAgencyClientIndex(Agency agency)
{
var clients = graphClient
.RootNode
.Out<Agency>(Hosts.TypeKey, a => a.Key == agency.Key)
.In<Client>(ClientBelongsTo.TypeKey);
foreach (var client in clients)
{
var clientService = clientServiceCallback();
var akas = client.Out<AlsoKnownAs>(IsAlsoKnownAs.TypeKey).Select(a => a.Data);
var indexEntries = clientService.GetIndexEntries(agency, client.Data, akas);
graphClient.ReIndex(client.Reference, indexEntries);
}
}
Querying a full text search index using Lucene
Below is sample code to query a full text search index. Take a person with
Name: Bob, Surname: Van de Builder, Aka1: Bobby, Aka2: Bobs, PreferredName: Bob The Builder
The index entries will need to look like this:
Key: Value
Name: Bob
Name: Van
Name: de
Name: Builder
Name: Bobby
Name: Bobs
Remember, Lucene uses a whitespace analyser here, so each word in a name with spaces MUST become its own index entry. What we do is split names on whitespace, and the resulting words become our collection of IndexEntries. This applies to the full text search context.
Note: If using an EXACT index match, then composite entries are needed for multiple words, since you are no longer using Lucene's full text search capabilities, e.g.
Name: Bob The Builder
This is good to know, because things like postal code or Gender searches, where exact matches are required, do not need full text indexes.
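The whitespace splitting described above for full text entries could be sketched roughly like this. NameIndexSketch and ToNameEntries are hypothetical names, not part of Neo4jClient, and dropping duplicate words (ignoring case) is my own assumption rather than something the client requires.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only (not part of Neo4jClient).
public static class NameIndexSketch
{
    // Splits every name on whitespace and returns one key/value pair per distinct word.
    public static List<KeyValuePair<string, object>> ToNameEntries(string key, IEnumerable<string> names)
    {
        return names
            .Where(n => !string.IsNullOrWhiteSpace(n))
            // A null separator array makes Split treat any whitespace as a delimiter.
            .SelectMany(n => n.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
            // Assumption: duplicate words add nothing to the index, so keep the first occurrence only.
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .Select(word => new KeyValuePair<string, object>(key, word))
            .ToList();
    }
}
```

The result has the same KeyValuePair<string, object> shape that GetIndexEntries builds, so it could be fed into an IndexEntry's KeyValues.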
Let's check out an example of querying an index.
[Test]
public void VerifyWhenANewClientIsCreateThatPartialNameCanBeFuzzySearchedInTheFullTextSearchIndex()
{
using (var agency = Data.NewTestAgency())
using (var client = Data.NewTestClient(agency, c =>
{
c.Gender = Gender.Male;
c.GivenName = "Joseph";
c.MiddleNames = "Mark";
c.FamilyName = "Kitson";
c.PreferredName = "Joey";
c.AlsoKnownAses = new List<AlsoKnownAs>
{
new AlsoKnownAs {Name = "J-Man"},
new AlsoKnownAs {Name = "J-Town"}
};
}
))
{
var indexName = IndexNames.Clients(agency.Agency.Data);
const string partialName = "+Name:Joe~+Name:Kitson~";
var result = GraphClient.QueryIndex<Client>(indexName, IndexFor.Node, partialName);
Assert.AreEqual(client.Client.Data.UniqueId, result.First().Data.UniqueId);
}
}
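The query string in the test above is hand written. As a rough sketch, it could also be assembled from the search words a user types. FuzzyQuerySketch and BuildFuzzyNameQuery are hypothetical names, not part of Neo4jClient. The sketch assumes the words contain no Lucene special characters (nothing is escaped) and separates the clauses with a space, which is the usual way to write them in Lucene query syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only (not part of Neo4jClient).
public static class FuzzyQuerySketch
{
    // Builds a query where every word is required (+) and matched fuzzily (~).
    // Assumption: words contain no Lucene special characters, so no escaping is done.
    public static string BuildFuzzyNameQuery(string key, IEnumerable<string> words)
    {
        var clauses = words
            .Where(w => !string.IsNullOrWhiteSpace(w))
            .Select(w => string.Format("+{0}:{1}~", key, w.Trim()));
        return string.Join(" ", clauses);
    }
}
```

The resulting string can be passed as the query argument to QueryIndex in the same way as partialName in the test.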
Dates
You may have noticed in some of the code that when I store date entries in the index, I store them as Ticks, so they are indexed as long numbers. This is awesome, as it gives raw power to searching dates via longs.
[Test]
public void VerifyWhenANewClientIsCreateThatTheDateOfBirthCanBeRangeSearchedInTheFullTextSearchIndex()
{
// Arrange
const long dateOfBirthTicks = 634493518171556320;
using (var agency = Data.NewTestAgency())
using (var client = Data.NewTestClient(agency, c =>
{
c.Gender = Gender.Male;
c.GivenName = "Joseph";
c.MiddleNames = "Mark";
c.FamilyName = "Kitson";
c.PreferredName = "Joey";
c.DateOfBirth = new DateTimeOffset(dateOfBirthTicks, new TimeSpan());
c.CurrentAge = null;
c.AlsoKnownAses = new List<AlsoKnownAs>
{
new AlsoKnownAs {Name = "J-Man"},
new AlsoKnownAs {Name = "J-Town"}
};
}
))
{
// Act
var indexName = IndexNames.Clients(agency.Agency.Data);
var partialName = string.Format("DateOfBirth:[{0} TO {1}]", dateOfBirthTicks - 5, dateOfBirthTicks + 5);
var result = GraphClient.QueryIndex<Client>(indexName, IndexFor.Node, partialName);
// Assert
Assert.AreEqual(client.Client.Data.UniqueId, result.First().Data.UniqueId);
}
}
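The range query in the test above works on raw tick values. A small sketch for building the same kind of query from two dates might look like this. DateRangeQuerySketch and BuildRange are hypothetical names, not part of Neo4jClient. One assumption to be aware of: if the index compares the values as text, the range only behaves numerically while all tick values have the same number of digits, which holds for present-day dates.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only (not part of Neo4jClient).
public static class DateRangeQuerySketch
{
    // Builds an inclusive Lucene range query over UTC ticks, e.g. DateOfBirth:[a TO b].
    public static string BuildRange(string key, DateTimeOffset from, DateTimeOffset to)
    {
        // InvariantCulture keeps the ticks free of culture-specific digit grouping.
        return string.Format(
            CultureInfo.InvariantCulture,
            "{0}:[{1} TO {2}]",
            key,
            from.UtcTicks,
            to.UtcTicks);
    }
}
```

Using UtcTicks on both ends matches how GetIndexEntries stores DateOfBirth, so the stored value and the query bounds are on the same scale.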
Summary
Well, I hope you found this post useful. Neo4jClient is on NuGet, so have a bash at using it; I would love to hear your feedback.
Download
NuGetPackage:
Source Code at:
Cheers