After two summers of remote work, I can confidently say that my work experience and communities are drastically different from what they were before the major shift out of the office. To me, it seems that the more we turn to online work, the less we connect, especially in person. This lack of connection can erode the benefits that come from a sense of belonging within a community, which in turn affects the success of both the individual and the company.
“Research conducted by Shawn Achor showed that ‘work altruists were 10 times more likely to be engaged […] and 40% more likely to receive a promotion.’ These people are the ones who are so connected to the mission of a company that they increase social interaction during a crisis, rather than decrease it.” https://www.urbanbound.com/blog/importance-of-community-in-the-workplace
Community, collaboration, and communication are vital to the success of an organization. Together, these three things can foster authenticity in individuals, which helps them advance in their own careers and furthers the mission of the company.
You may be thinking this all sounds nice, but what does it mean for productivity and profit at an organization? It turns out, companies lacking a positive sense of community are more likely to see high turnover rates, low employee morale, and unnecessary workplace drama, gossip, and power struggles. So it’s safe to say it’s a pretty important factor not just for employees to consider but for employers as well. https://www.kenzie.academy/blog/why-community-matters-in-the-workplace/
How can an individual know where to begin seeking a community without the networking opportunities of being in the office? Because community within a workplace is so important, this article focuses on how we could leverage graph technologies for workplace community discovery, to provide a starting place for community building.
But why should we be thinking of using graph technologies to find connections in our data? Unlike other database types, such as relational databases (think SQL, table representation), relationships are the first priority of graph databases such as Neo4j and TigerGraph. In a graph database, each node represents an entity like a person or a category. Each node is described by attributes. For example, an attribute of a person could be their email or their birthdate. The edge represents the relationship of how two nodes are associated. Because of the relationship priority of graph databases, they allow an authentic depiction of how the world interacts. This can be leveraged for many scenarios such as medical history, user product reviews, and in our case, community discovery.
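To make the node, attribute, and edge vocabulary concrete, here is a minimal in-memory sketch in Python. It is not how Neo4j or TigerGraph store data; the `Node` and `Edge` classes and the sample person are hypothetical, made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the graph, such as a person or a category."""
    node_type: str
    node_id: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A named relationship that associates two nodes."""
    edge_type: str
    source: Node
    target: Node


# Hypothetical sample data, not taken from the article's dataset.
alice = Node("Person", "p1", {"email": "alice@example.com", "birthdate": "1990-04-12"})
latin = Node("Course", "c1", {"name": "Latin"})
edges = [Edge("HAS_TAKEN_COURSE", alice, latin)]

# Following a relationship means walking edges, not joining tables.
courses_taken = [e.target.attributes["name"] for e in edges if e.source is alice]
print(courses_taken)
```

The point of the sketch is that the relationship is a first-class object: the question "which courses has this person taken?" is answered by following edges from one node.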
Data Generation & TigerGraph
I used Online Data Generator to generate synthetic employee data for my graph database. The generator wrote 1000 random entries to a .csv file with the following fields: ID, Job Title, Email Address, FirstName LastName, Department, Course. This was enough data to start designing my graph schema. The data was mapped, loaded, and explored using TigerGraph GraphStudio.
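For readers who would rather script the data than use an online tool, a comparable file could be produced with a few lines of Python. This is only a sketch: the column names follow the fields listed above, but the name, job title, department, and course pools are invented here and are not the values the online generator produced.

```python
import csv
import io
import random

# Column names follow the fields listed above; the value pools are made up.
FIELDS = ["ID", "Job Title", "Email Address", "FirstName LastName", "Department", "Course"]
FIRST = ["Ava", "Noah", "Mia", "Liam"]
LAST = ["Reed", "Shaw", "Cole", "Park"]
TITLES = ["Analyst", "Engineer", "Designer"]
DEPARTMENTS = ["Finance", "IT", "Marketing"]
COURSES = ["Latin", "Statistics", "Public Speaking"]


def generate_rows(n, seed=0):
    """Return n synthetic employee records as dictionaries."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    rows = []
    for i in range(1, n + 1):
        first, last = rng.choice(FIRST), rng.choice(LAST)
        rows.append({
            "ID": i,
            "Job Title": rng.choice(TITLES),
            "Email Address": f"{first}.{last}{i}@example.com".lower(),
            "FirstName LastName": f"{first} {last}",
            "Department": rng.choice(DEPARTMENTS),
            "Course": rng.choice(COURSES),
        })
    return rows


def to_csv(rows):
    """Serialize the records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = generate_rows(1000)
csv_text = to_csv(rows)
print(csv_text.splitlines()[0])  # header row
```

Writing `csv_text` to a file gives a .csv with one employee per row, each with exactly one job title, department, and course.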
Designing the Schema
After exploring the data generated in my .csv file, I was ready to start designing my graph schema in the ‘Design Schema’ tab. This schema contains vertices and edges. Each Employee node is described by attributes such as email address and first and last name. An Employee node is connected to a Course node by the HAS_TAKEN_COURSE edge, which represents the relationship between an employee and a course. The JobTitle and Department nodes are connected to Employee nodes in the same way, as shown below.
After publishing my finished graph schema, I was ready to upload and map my synthetic employee data to my schema. The data mapping associates the fields in a set of data files to attributes of the vertex and edge types. The data from the .csv was easily mapped in the ‘Map Data To Graph’ tab.
After the data mapping was published, the data was ready to be loaded. In the image below, we can see the total number of vertices and edges of each type. One thing to note is the total number of edges: 6000! That means there are a lot of connections between Employees, Departments, Job Titles, and Courses.
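The mapping and loading steps can be pictured as turning each .csv row into one Employee vertex plus one edge to each related vertex. The Python sketch below imitates that idea in memory; it is not GraphStudio's loader. Only HAS_TAKEN_COURSE is an edge name from this article; `HAS_JOB_TITLE` and `IN_DEPARTMENT` are hypothetical names, and the three rows are made-up sample data.

```python
import csv
import io
from collections import Counter

# A tiny inline stand-in for the generated .csv file (values are made up).
CSV_TEXT = """ID,Job Title,Email Address,FirstName LastName,Department,Course
1,Analyst,ava.reed@example.com,Ava Reed,Finance,Latin
2,Engineer,noah.shaw@example.com,Noah Shaw,IT,Latin
3,Analyst,mia.cole@example.com,Mia Cole,IT,Statistics
"""

# column -> (target vertex type, edge type)
EDGE_MAP = {
    "Course": ("Course", "HAS_TAKEN_COURSE"),        # edge name from the article
    "Job Title": ("JobTitle", "HAS_JOB_TITLE"),      # hypothetical edge name
    "Department": ("Department", "IN_DEPARTMENT"),   # hypothetical edge name
}

vertices = set()  # (vertex_type, id); a set deduplicates shared courses, titles, departments
edges = []        # (edge_type, employee_id, target_id)

for row in csv.DictReader(io.StringIO(CSV_TEXT)):
    vertices.add(("Employee", row["ID"]))
    for column, (vertex_type, edge_type) in EDGE_MAP.items():
        vertices.add((vertex_type, row[column]))
        edges.append((edge_type, row["ID"], row[column]))

vertex_counts = Counter(vertex_type for vertex_type, _ in vertices)
edge_counts = Counter(edge_type for edge_type, _, _ in edges)
print(dict(vertex_counts))
print(dict(edge_counts))
```

Employee attributes such as name and email are left out of the sketch to keep it short; in a real mapping they would be attached to the Employee vertex.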
After moving to the ‘Explore Graph’ tab of TigerGraph GraphStudio, I chose to explore five vertices of each vertex type. From there I selected a few Department and Employee vertices and found a few cool connections. Below, you can see the many connections among Courses, Employees, Departments, and Job Titles. There are meaningful connections to be made even in synthetic data!
Writing a Few Simple Queries
The final step was to write a few custom GSQL queries in the ‘Write Queries’ tab to get more meaningful results based on different inputs. I started with a simple query that returns all employees who have taken the Latin course. Shown below are the query and its results! The same can be done for Job Titles and Departments with a few simple changes to the query.
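For readers without a TigerGraph instance, the logic of this lookup can be sketched in plain Python. This is a stand-in for the idea, not the GSQL query itself; the edge tuples, the `IN_DEPARTMENT` list, and the helper `employees_linked_to` are hypothetical.

```python
# Each tuple is (employee name, target vertex); all values are made up.
HAS_TAKEN_COURSE = [
    ("Ava Reed", "Latin"),
    ("Noah Shaw", "Statistics"),
    ("Mia Cole", "Latin"),
]
IN_DEPARTMENT = [  # hypothetical edge list, used to show the same lookup on departments
    ("Ava Reed", "Finance"),
    ("Noah Shaw", "IT"),
    ("Mia Cole", "IT"),
]


def employees_linked_to(target, edge_list):
    """Return the sorted names of employees that have an edge to the given target."""
    return sorted(name for name, linked in edge_list if linked == target)


latin_group = employees_linked_to("Latin", HAS_TAKEN_COURSE)
it_group = employees_linked_to("IT", IN_DEPARTMENT)
print(latin_group)
print(it_group)
```

Swapping the edge list is the "few simple changes" needed to ask the same question about departments or job titles.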
Employee Group Query
Let’s say I already know who is in my department, and I can find out who else has taken the same course as I have. Can I also find the employees who share both a course and a job title with me but are not in my department? The query below does just that. ‘Matthew Harris’ now has a new employee group to join, based on his own job and course interests. Because the group is outside his department, these are people he may never have communicated with before!
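The filtering rule behind this group (same course, same job title, different department) can be sketched in plain Python. This is an in-memory illustration, not the GSQL query; the attributes given to Matthew Harris and the other employees are invented, and `find_group` is a hypothetical helper.

```python
# Made-up records; only the name 'Matthew Harris' comes from the article.
EMPLOYEES = {
    "Matthew Harris": {"department": "Finance", "job_title": "Analyst", "course": "Latin"},
    "Ava Reed": {"department": "IT", "job_title": "Analyst", "course": "Latin"},
    "Noah Shaw": {"department": "Finance", "job_title": "Analyst", "course": "Latin"},
    "Mia Cole": {"department": "IT", "job_title": "Engineer", "course": "Latin"},
    "Liam Park": {"department": "Marketing", "job_title": "Analyst", "course": "Statistics"},
}


def find_group(name, employees):
    """Employees who share a course and a job title with `name` but work in another department."""
    me = employees[name]
    return sorted(
        other
        for other, attrs in employees.items()
        if other != name
        and attrs["course"] == me["course"]
        and attrs["job_title"] == me["job_title"]
        and attrs["department"] != me["department"]
    )


group = find_group("Matthew Harris", EMPLOYEES)
print(group)
```

In this sample, Noah Shaw is excluded for being in the same department, Mia Cole for having a different job title, and Liam Park for having taken a different course.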
Next Steps: Diverse and More Informative Data, Recommendation Algorithms
The group created above is a great starting point for Matthew Harris to find a few new employees in his organization who have similar interests to reach out to and connect with! But can this be improved to find more similar employee groups? The answer is yes! The next step would be to apply graph exploration algorithms alongside our queries. But there is a catch with our data: it lacks the detail we would need to recommend more accurate employee groups. For example, a cosine similarity score could be calculated if the data contained employees who have taken multiple courses. Algorithms based on collaborative filtering are well suited to recommendation systems because they use the similarities between users and between items together to produce recommendations. There are also many graph algorithms that capitalize on data such as ratings, comments, discussions, and views, which could be used to give more weight to higher-rated courses.
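As an illustration of the cosine similarity idea, the Python sketch below assumes a dataset in which each employee has taken several courses, which the synthetic data in this article does not provide. Each employee is treated as a binary vector over courses, so the score reduces to the number of shared courses divided by the square root of the product of the two course counts. The course sets and helper names are hypothetical.

```python
import math

# Made-up data: each employee maps to the set of courses they have taken.
COURSES_TAKEN = {
    "Matthew Harris": {"Latin", "Statistics", "Public Speaking"},
    "Ava Reed": {"Latin", "Statistics"},
    "Noah Shaw": {"Accounting"},
}


def cosine_similarity(a, b):
    """Cosine similarity of two course sets viewed as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def rank_similar(name, courses_taken):
    """Other employees ordered from most to least similar to `name`."""
    mine = courses_taken[name]
    scores = [
        (other, cosine_similarity(mine, taken))
        for other, taken in courses_taken.items()
        if other != name
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranked = rank_similar("Matthew Harris", COURSES_TAKEN)
for other, score in ranked:
    print(f"{other}: {score:.3f}")
```

With richer data, the same ranking could be weighted by course ratings or other signals instead of simple course overlap.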