When performing a simple query, such as retrieving every Thing in the database, Client Python is around five times slower than the other clients at fetching the answers. To reproduce:
- Define a schema with a single attribute type: `first_name sub attribute, value string`
- Insert 50,000 first names, initialised to random UUIDs
- Run `match $x isa first_name;` and collect all answers, in Client Python and in TypeDB Studio
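The shape of the benchmark can be sketched with the standard library alone (no TypeDB server involved); the `answer_stream` generator here is only a stand-in for the gRPC answer stream that a real `match` query returns, so the timing it reports excludes all network and serialisation costs:

```python
# Standalone sketch of the benchmark's shape, stdlib only.
# "answer_stream" is an illustrative stand-in for the gRPC answer
# stream a real TypeDB match query returns; it is not the client API.
import time
import uuid

N = 50_000

# Step 2 of the reproduction: 50,000 first names, initialised to random UUIDs.
first_names = [str(uuid.uuid4()) for _ in range(N)]

def answer_stream(values):
    """Stand-in for the streamed answers of `match $x isa first_name;`."""
    yield from values

# Step 3: collect every answer, timing only the fetch.
start = time.perf_counter()
answers = list(answer_stream(first_names))
elapsed = time.perf_counter() - start
print(f"collected {len(answers)} answers in {elapsed:.3f}s")
```

In the real benchmark, the same collect-all-answers loop is what runs five times slower in Client Python than in Studio.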
According to the gRPC "Performance Best Practices" guide:

"Streaming RPCs create extra threads for receiving and possibly sending the messages, which makes streaming RPCs much slower than unary RPCs in gRPC Python, unlike the other languages supported by gRPC."
This implies that the behaviour we see (where Python is significantly slower at retrieving query answers than Studio, which uses Client Java) is expected in gRPC Python.
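To see why an extra receive thread matters, here is a toy comparison, stdlib only: iterating a list directly versus pulling the same items through a `queue.Queue` fed by a producer thread. This loosely mimics the hand-off that gRPC Python's streaming receive path performs; it is a model of the overhead, not gRPC itself:

```python
# Toy model: direct iteration vs. items handed over from a producer
# thread via a queue, loosely mimicking gRPC Python's extra receive
# thread for streaming RPCs. Not gRPC itself.
import queue
import threading
import time

ITEMS = list(range(50_000))
SENTINEL = object()

def direct():
    # No thread hand-off: the baseline cost of consuming the items.
    return [x for x in ITEMS]

def via_thread():
    # Producer thread pushes items through a bounded queue, and the
    # consumer pulls them out one by one, paying for locking and
    # context switches on every item.
    q = queue.Queue(maxsize=64)

    def producer():
        for x in ITEMS:
            q.put(x)
        q.put(SENTINEL)

    threading.Thread(target=producer).start()
    out = []
    while (x := q.get()) is not SENTINEL:
        out.append(x)
    return out

for name, fn in [("direct", direct), ("via thread+queue", via_thread)]:
    start = time.perf_counter()
    result = fn()
    print(f"{name}: {len(result)} items in {time.perf_counter() - start:.3f}s")
```

Both paths produce identical results; the thread-and-queue path just pays a per-item synchronisation cost, which is the kind of overhead the gRPC guide attributes to streaming RPCs in Python.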
The gRPC team propose one potential fix: refactoring Client Python to use asyncio. However, this is a significant undertaking, and with the development of typedb-client-rust due to be completed in the fairly near future, we will instead be rewriting Client Python as a thin Python wrapper over an underlying Rust library, which should run optimally and resolve this issue.
In the meantime, if performance is essential for a product, I'd recommend using Client Java, Node.js or Julia, via a separate microservice that the Python program connects to if need be.
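The microservice workaround can be sketched as follows. The `/query` endpoint and its JSON shape are invented for illustration, and a stdlib stub stands in for the real service (which would be written with Client Java, Node.js, etc.) so that the example is self-contained:

```python
# Sketch of the workaround: run queries in a separate service and call
# it from Python over HTTP. The /query endpoint, its JSON shape, and
# the canned answers are all hypothetical; a stdlib stub server stands
# in for the real Client Java / Node.js microservice.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class StubQueryService(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        query = json.loads(body)["query"]
        # A real service would run `query` against TypeDB here.
        answers = ["alice", "bob"] if "first_name" in query else []
        payload = json.dumps({"answers": answers}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # keep output quiet
        pass

def fetch_answers(port, query):
    """Python-side call into the (hypothetical) query microservice."""
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}/query",
        data=json.dumps({"query": query}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["answers"]

server = ThreadingHTTPServer(("127.0.0.1", 0), StubQueryService)
threading.Thread(target=server.serve_forever, daemon=True).start()
answers = fetch_answers(server.server_address[1], "match $x isa first_name;")
print(answers)
server.shutdown()
```

The Python side then only pays for one request/response per query rather than per-answer streaming overhead, while the heavy answer fetching happens in a faster client.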
This issue is tracked on GitHub as vaticle/typedb-client-python#257: "Python client significantly slower when retrieving query results than Studio".