YottaDB Support for Multi-Threaded Applications
The YottaDB database engine operates within the address space of the application process, and is itself single threaded. Why? There is a historical reason as well as a current reason.
When the YottaDB code base was first developed in the 1980s, operating systems did not support multiple threads within a process. Furthermore, for many years, the database was tightly coupled to the M language, which supports multiple concurrent processes, but not multi-threaded processes. Today, even as it continues to support M, YottaDB has grown beyond M: YottaDB provides a tight integration with C, which does support threads, and through C to other languages.
Since the YottaDB code base has been continuously invested in, why is it still single-threaded more than thirty years after it first saw daily production use in a mission-critical application? The reason is that there is not an obviously significant benefit from making the engine itself multi-threaded. Whether updates are generated by multiple processes or multiple threads, database accesses must still be internally consistent, database updates must still be serialized, and transactions must ensure ACID properties. YottaDB gets its extraordinary performance in part because processes use shared memory to cooperate in managing the database. It is not just the logic of the database engine that is in the address space of the process – the shared memory control structures and data are also in the address space of the process. With the threads in a process already sharing an address space, we have not been able to quantify a benefit from making the engine itself multi-threaded. While there is very likely at least some performance benefit to making the engine multi-threaded, we have not at this time identified a performance benefit significant enough to warrant the investment.
While there may not be value in making the database engine multi-threaded, there is definitely value in allowing multi-threaded applications to benefit from integration with YottaDB: many applications are multi-threaded, and can benefit from YottaDB’s data management capabilities. What is needed is an API that multi-threaded applications can use to access YottaDB. This post discusses why the API used by multi-threaded applications must differ from the API used by single-threaded applications, and how YottaDB’s single-threaded engine supports multi-threaded applications.[1]
Single- vs. Multi-Threaded Applications
Function calls are synchronous. When a single-threaded application calls a YottaDB function such as ydb_get_s(), the YottaDB engine shares a call-stack with the application code and executes in the same thread. Except for recursive calls that implement transaction processing (see below), the application is blocked until YottaDB returns to the caller.
In a multi-threaded application, the YottaDB engine cannot execute in the same thread as the caller, because after servicing a request from one thread, application logic in a different thread may require service. Therefore, the YottaDB engine must execute in its own thread, different from that of any application thread. Instead of calling ydb_get_s(), for example, the application thread calls the function ydb_get_st() which executes in the same thread as the caller. ydb_get_st() puts a message in a process-private queue for the thread executing the YottaDB engine, then waits for and returns the response from YottaDB to the caller. While the call to ydb_get_st() is synchronous for the caller, which is blocked till ydb_get_st() returns, other threads continue execution. The YottaDB engine services queued requests. With the exception of a tptoken parameter that the multi-threaded functions require (see Transaction Processing below), the APIs track each other with a few differences:
- As YottaDB can execute either in the same thread as the application code that calls it, or in a different thread, but not both, an application can use either the single-threaded API or the multi-threaded API, but not both. The first call to the YottaDB engine initializes the engine appropriately, and subsequent calls must match that initialization (or get an error).
- As the M language is single-threaded, any execution of M, whether from the shell or through a call-in from C to M, initializes the engine for operation in a single thread.
- Some utility functions simply execute in the same thread as the application code that calls them, whether in single- or multi-threaded applications, and do not need separate single- and multi-threaded versions.
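The hand-off described above, where a function running in the caller's thread queues a request for the engine thread and blocks only its own caller, can be illustrated with a minimal sketch using POSIX threads. This is not YottaDB's implementation: the names demo_call_st(), demo_engine_start(), demo_engine_stop() and the request layout are hypothetical, and the "work" done by the engine is a placeholder.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical request record: one per call from an application thread. */
typedef struct request {
    int             input;    /* stand-in for the call's arguments */
    int             result;   /* filled in by the engine thread */
    int             done;     /* set to 1 once result is valid */
    pthread_cond_t  done_cv;  /* the calling thread waits on this */
    struct request *next;
} request_t;

static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  q_nonempty = PTHREAD_COND_INITIALIZER;
static request_t      *q_head = NULL, *q_tail = NULL;
static int             q_shutdown = 0;
static pthread_t       engine_thread;

/* Engine thread: the only thread that ever runs "engine" logic.
 * For brevity it does its placeholder work while holding q_lock. */
static void *engine_main(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&q_lock);
    for (;;) {
        while (q_head == NULL && !q_shutdown)
            pthread_cond_wait(&q_nonempty, &q_lock);
        if (q_head == NULL)           /* shutdown requested, queue drained */
            break;
        request_t *req = q_head;      /* dequeue the oldest request */
        q_head = req->next;
        if (q_head == NULL)
            q_tail = NULL;
        req->result = req->input * 2; /* placeholder for real engine work */
        req->done = 1;
        pthread_cond_signal(&req->done_cv);
    }
    pthread_mutex_unlock(&q_lock);
    return NULL;
}

/* Synchronous wrapper: runs in the caller's thread and blocks only the
 * caller; other application threads keep running. */
int demo_call_st(int input)
{
    request_t req = { .input = input, .done = 0, .next = NULL };
    int rc = pthread_cond_init(&req.done_cv, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&q_lock);
    if (q_tail != NULL)
        q_tail->next = &req;
    else
        q_head = &req;
    q_tail = &req;
    pthread_cond_signal(&q_nonempty);  /* wake the engine thread */
    while (!req.done)                  /* wait for the engine's response */
        pthread_cond_wait(&req.done_cv, &q_lock);
    pthread_mutex_unlock(&q_lock);

    pthread_cond_destroy(&req.done_cv);
    return req.result;
}

/* Start the engine thread; returns 0 on success. */
int demo_engine_start(void)
{
    return pthread_create(&engine_thread, NULL, engine_main, NULL);
}

/* Ask the engine thread to exit once its queue is empty, then join it. */
void demo_engine_stop(void)
{
    pthread_mutex_lock(&q_lock);
    q_shutdown = 1;
    pthread_cond_signal(&q_nonempty);
    pthread_mutex_unlock(&q_lock);
    pthread_join(engine_thread, NULL);
}
```

Because every request passes through one queue serviced by one thread, the engine logic itself never needs to be re-entrant, which is the property the design above relies on.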
Transaction Processing
YottaDB uses Optimistic Concurrency Control to implement high throughput transaction processing with fully ACID transactions. Single-threaded applications call ydb_tp_s() and provide it with a function f_t() to execute the logic for the transaction. ydb_tp_s() calls f_t(), which in turn can call YottaDB directly, or call other functions that call YottaDB.
Transaction logic that executes across multiple threads in a single process introduces complications not found in transaction logic that executes in a single thread. For example:
1. A process has two application threads, T1 and T2, with YottaDB executing in thread T3. T1 calls ydb_tp_st() to perform a transaction whose logic is in function ft(). ydb_tp_st() queues a request for YottaDB and awaits a response; the caller is blocked till ydb_tp_st() returns.
2. YottaDB spawns a thread T4 executing function ft(). Now both T2 and T4 are actively executing application logic.
3. YottaDB receives a message on its queue. YottaDB needs a way to determine whether the request is from the logic in T2, which should block until the ydb_tp_st() called by T1 returns, or from T4, which YottaDB should process.
In order to determine whether a request on its queue is from T2 or T4 – and in the general case, to determine whether a request on the queue should be serviced – function calls that support threaded applications include a tptoken parameter as follows:
- When application code that is not inside a transaction calls YottaDB, it passes a tptoken value specified by the symbolic constant YDB_NOTTP.
- When ydb_tp_st() calls ft(), it generates and passes to ft() a tptoken value.
- ft(), as well as any calls to YottaDB made by any threads and functions that are part of the transaction logic of ft(), should use the value provided by YottaDB to ft() when calling YottaDB.
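The three rules above can be sketched with a small single-threaded mock that only tracks how the token flows. Everything here is a hypothetical stand-in rather than the YottaDB API: DEMO_NOTTP plays the role of YDB_NOTTP, demo_tp_st() the role of ydb_tp_st(), and demo_incr_st() the role of any data call that takes a tptoken. Because the mock has no second thread that could complete the transaction, a call made with the wrong token is simply refused instead of being deferred.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NOTTP ((uint64_t)0)  /* hypothetical stand-in for YDB_NOTTP */

/* Shape of a transaction callback in this mock: it receives the token. */
typedef int (*demo_tpfn_t)(uint64_t tptoken, void *arg);

static uint64_t current_token = DEMO_NOTTP;  /* context now being serviced */
static uint64_t next_token = 1;              /* next token value to issue */

/* Stand-in for a data call: serviced only if made in the current context. */
int demo_incr_st(uint64_t tptoken, int *counter)
{
    if (tptoken != current_token)
        return -1;   /* wrong context; the mock refuses rather than defers */
    ++*counter;
    return 0;
}

/* Stand-in for ydb_tp_st(): opens a new context, passes its token to the
 * callback, and restores the previous context when the callback returns. */
int demo_tp_st(uint64_t tptoken, demo_tpfn_t fn, void *arg)
{
    assert(fn != NULL);
    if (tptoken != current_token)
        return -1;
    uint64_t saved = current_token;
    current_token = next_token++;
    int status = fn(current_token, arg);
    current_token = saved;
    return status;
}

/* Correct transaction logic: passes on the token it was given. */
int ft(uint64_t tptoken, void *arg)
{
    return demo_incr_st(tptoken, (int *)arg);
}

/* Incorrect transaction logic: ignores its token and uses DEMO_NOTTP. */
int ft_wrong(uint64_t tptoken, void *arg)
{
    (void)tptoken;
    return demo_incr_st(DEMO_NOTTP, (int *)arg);
}
```

In this mock, demo_tp_st(DEMO_NOTTP, ft, &n) succeeds and increments n, while demo_tp_st(DEMO_NOTTP, ft_wrong, &n) fails and leaves n unchanged, because the call inside the transaction did not carry the transaction's token.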
In the example above, threads T1 and T2 would use YDB_NOTTP as the tptoken value. When YottaDB spawns thread T4 executing ft(), it generates a tptoken and passes it to ft(). When the transaction logic in T4 calls YottaDB, it passes in this tptoken. Thus, when YottaDB gets a message in step 3 of the example above, it determines the action to take based on tptoken:
- If it is the value it passed to ft() (the current token), it should act on the request.
- If it is YDB_NOTTP, YottaDB should defer acting on the request until ft() completes and YottaDB receives the return code.
- If it is any other value, YottaDB should respond to the message with an error.
More generally, YottaDB maintains a stack of tptoken values it has issued and which have not been closed (as a consequence of the functions implementing transaction processing logic completing their work). When receiving a request, if its tptoken is
- that of the current transaction level, i.e., on top of the stack, act on the request;
- that of a transaction level within the stack, defer acting on it till that token comes to the top of the stack; and
- any other value, it is an error.
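As an illustration of the stack rule above, the following sketch classifies an incoming request by its tptoken. It is a model of the rule as stated here, not YottaDB's internal code: the type and function names are hypothetical, and it makes the modelling assumption that the "no transaction" token sits at the bottom of the stack, so that it is acted on when no transaction is open and deferred otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NOTTP     ((uint64_t)0)  /* hypothetical stand-in for YDB_NOTTP */
#define DEMO_MAX_DEPTH 16             /* arbitrary limit for this sketch */

typedef enum { DEMO_ACT, DEMO_DEFER, DEMO_REJECT } demo_action_t;

typedef struct {
    uint64_t tokens[DEMO_MAX_DEPTH];  /* tokens[0] is always DEMO_NOTTP */
    size_t   depth;                   /* entries in use, always >= 1 */
    uint64_t next;                    /* next token value to issue */
} demo_tpstack_t;

void demo_stack_init(demo_tpstack_t *s)
{
    s->tokens[0] = DEMO_NOTTP;
    s->depth = 1;
    s->next = 1;
}

/* A transaction starts: issue a fresh token and push it. */
uint64_t demo_stack_open(demo_tpstack_t *s)
{
    assert(s->depth < DEMO_MAX_DEPTH);
    uint64_t token = s->next++;
    s->tokens[s->depth++] = token;
    return token;
}

/* The transaction logic on top of the stack completed: pop its token. */
void demo_stack_close(demo_tpstack_t *s)
{
    assert(s->depth > 1);  /* never pop the DEMO_NOTTP entry */
    s->depth--;
}

/* Decide what to do with a request that carries the given token. */
demo_action_t demo_classify(const demo_tpstack_t *s, uint64_t tptoken)
{
    if (tptoken == s->tokens[s->depth - 1])
        return DEMO_ACT;              /* current transaction level */
    for (size_t i = 0; i + 1 < s->depth; i++)
        if (tptoken == s->tokens[i])
            return DEMO_DEFER;        /* an enclosing level: wait */
    return DEMO_REJECT;               /* never issued, or already closed */
}
```

With one transaction open, its token is classified as DEMO_ACT and DEMO_NOTTP as DEMO_DEFER; a token whose transaction has already been closed, or a value that was never issued, is classified as DEMO_REJECT.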
In other words, tptoken is a transaction context. Except for calls to YottaDB that are outside any transaction context, and so identified by YDB_NOTTP, calls to ydb_tp_st() create transaction contexts that YottaDB provides to functions implementing transaction logic, and which they in turn must provide when calling YottaDB. The datatype of tptoken is opaque to application software (at least, as opaque as anything can be in the world of free / open source software), and if application code attempts any operation on a tptoken other than passing it back to YottaDB as a parameter, the consequences are unpredictable and likely to be hard to debug.
Since a tptoken is a context that is not associated with any particular thread, it also works for a language like Go, which accesses YottaDB through the Simple API for threaded applications. A potentially large number (hundreds to thousands) of Goroutines are dynamically scheduled over a smaller number of process threads, and in the course of its lifetime, a Goroutine may migrate to execute on many threads. But the transaction context established by a tptoken stays with a Goroutine regardless of the thread on which it is executing.
Effective r1.24,[2] YottaDB supports Simple API functions for threaded applications as field test grade functionality in a production release, with the intention of supporting them as production grade functionality in the next release. Access from Go is available as a wrapper that accesses YottaDB r1.24 & up via the Simple API for threaded applications, and in turn exposes a Go API that Go application code can use. It is our intention in the future to add wrappers for other languages, also accessing YottaDB via the Simple API for threaded applications.
Documentation is in the Multi-Language Programmers Guide. Please do use the Simple API for threaded applications from C, Go, or another language, and please do tell us what you think.
[1] Except for transaction processing functionality, a single-threaded application can call the multi-threaded functions, but not vice versa.
[2] Software and documentation expected to be released late 2018 / early 2019; see the code under development at https://gitlab.com/YottaDB/DB/YDB .
Published on December 03, 2018