Hey,
I am trying to query the whole graph to turn it into a networkx graph and be able to analyze it in more detail.
To get the whole graph, I’m running the query:
match $x isa thing;
As far as I know, this is the fastest one possible for this purpose. In TypeDB Studio, this query gathers the whole graph (including the relations and attributes owned by each entity or relation). The problem I am experiencing is that when I run this query in the Python client, I cannot find any information in the retrieved concept maps about which attributes are owned by which entities or relations, or which entities each relation connects and with what roles. I only seem to get a list of all entities, relations, and attributes in the graph, without any information on how they are connected.
This makes it difficult for me to reconstruct the graph. I wonder if there is something I am missing, since TypeDB Studio seems to be able to reconstruct the graph successfully.
It sounds like you’ve succeeded in retrieving the vertices from TypeDB, but not the edges.
We have a remote concept API available through the Python client, see Thing | Vaticle. This provides three methods that should allow you to traverse the entire graph: get_has and get_relations for Thing, and get_players for Relation. By performing a simple graph search such as breadth-first or depth-first search, and using the get_iid method on any Thing to ensure you don’t end up in a cycle (a Relation also implements all Thing methods), you should be able to extract the information you want.
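As a rough illustration of that search, here is a minimal breadth-first sketch. The function name crawl and the remote parameter are made up for this example. It assumes each concept exposes get_iid and is_relation, and that remote(thing) returns an object offering get_has, get_relations and get_players; in the 2.x Python client that would presumably be something like lambda t: t.as_remote(transaction), but please check that against your client version. Roles are not captured here.

```python
from collections import deque


def crawl(seeds, remote=lambda thing: thing):
    """Breadth-first walk over things, returning (iids, has_edges, plays_edges).

    `remote` is a hypothetical adapter: it turns a concept into the object
    that actually exposes get_has / get_relations / get_players.
    """
    seen = set()          # IIDs already queued, so cycles terminate
    has_edges = set()     # (owner_iid, attribute_iid)
    plays_edges = set()   # (relation_iid, player_iid)
    queue = deque()

    def visit(thing):
        iid = thing.get_iid()
        if iid not in seen:
            seen.add(iid)
            queue.append(thing)
        return iid

    for seed in seeds:
        visit(seed)

    while queue:
        thing = queue.popleft()
        iid = thing.get_iid()
        api = remote(thing)
        for attribute in api.get_has():
            has_edges.add((iid, visit(attribute)))
        for relation in api.get_relations():
            plays_edges.add((visit(relation), iid))
        if thing.is_relation():
            for player in api.get_players():
                plays_edges.add((iid, visit(player)))

    return seen, has_edges, plays_edges
```

The two edge sets are plain tuples of IIDs, so they could be handed to networkx with add_edges_from. Note that each loop iteration makes two or three server calls, which is where the overhead discussed in this thread comes from.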
A word of warning: these are remote concept API calls and as such, the overhead of going to and from the server on each API call will have a lower bound of your round trip time to your server (i.e., your ping). If you intend to generate your networkx graph in real time from a large graph stored in TypeDB, be aware that the Java client offers a local concept API which will be much faster.
Yes, I intend to generate the local graph in real time. I saw that the Python client has this remote concept API, but I thought generating so much traffic between the client and the server was somehow inconvenient and slow, and was wondering if there was a way in the Python client to gather all the info needed quickly with just one query.
As you have suggested, I understand the best workaround would be to perform this query using the Java client and its local concept API.
There’s definitely a trade-off: if your graph is somewhat large (>1000 concepts) and the server is remote (i.e., not running on your machine) then the overhead will be noticeable. However, if the server is running locally the RTT should be <1ms, so for a somewhat smaller graph you might save yourself the trouble of forming a translation layer between Java and Python to create your networkx graph.
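To put rough numbers on that, here is a back-of-the-envelope estimate. The helper name and the figure of three remote calls per concept are assumptions for illustration only (one call each for attributes, relations and players), and it assumes the calls are made one after another, so it is a lower bound on network time, not a benchmark.

```python
def min_crawl_seconds(n_concepts, rtt_ms, calls_per_concept=3):
    """Lower bound on time spent waiting on round trips, in seconds.

    calls_per_concept=3 is an assumed figure, not a measured one.
    """
    return n_concepts * calls_per_concept * rtt_ms / 1000.0


print(min_crawl_seconds(1000, 1))    # local server, ~1 ms RTT -> 3.0
print(min_crawl_seconds(1000, 20))   # remote server, ~20 ms RTT -> 60.0
```

So for 1000 concepts the difference between a local and a remote server is roughly seconds versus a minute under these assumptions.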
Let us know how you get on! We’re always interested to see how people are making use of TypeDB.
Yeah, I will let you know! For the moment I’m gonna try to implement it entirely in Python, and if it seems to be too slow in the end I will consider the translation layer to Java.
I have another small question: is there any simple tweak I can make to the query I stated above to avoid retrieving all the attributes in the KG and instead just get the attributes that are owned by at least one entity or relation? In my use case, I’m constantly updating the attributes of a series of entities according to the flow of information in a network, so maybe only around 10% of the attributes I have created are assigned to an entity.
Thanks so much for the clear and fast responses!
The use case sounds interesting!
If we want all the attributes that are owned by entities, I think we’d express this in the form:
match $x isa entity, has $y; get $y;
We match all entities, then get all the attributes they have, then filter so we get just those attributes returned. This won’t include attributes of relations.
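If the ownership itself is wanted and not only the attributes, one option might be to keep $x in the answer (get $x, $y;) and read both concepts from each concept map, which gives the owner-attribute pairs in a single query. A minimal sketch follows, assuming the 2.x Python client's transaction.query().match(...), ConceptMap.get(name) and get_iid(); the function name is hypothetical, so please verify the calls against your client version.

```python
def owned_attribute_edges(
    transaction,
    query="match $x isa entity, has $y; get $x, $y;",
):
    """Return a set of (owner_iid, attribute_iid) pairs from one match query.

    `transaction` is assumed to be an open read transaction of the
    TypeDB Python client; nothing here has been run against a live server.
    """
    edges = set()
    for answer in transaction.query().match(query):
        owner = answer.get("x")      # the owning entity
        attribute = answer.get("y")  # the attribute it owns
        edges.add((owner.get_iid(), attribute.get_iid()))
    return edges
```

As with the original query, this variant only covers attributes owned by entities, not those owned by relations.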