Programming Notes (Avoiding Common Pitfalls)

As YottaDB is likely different from other data management and persistence engines you have used, this section provides information about features of YottaDB intended to help you avoid common pitfalls.

Numeric Considerations

To ensure the accuracy of financial calculations, [1] YottaDB internally stores numbers as, and performs arithmetic using, a scaled packed decimal representation with 18 significant decimal digits, with optimizations for values within a certain subset of its full range. YottaDB efficiently converts between strings and numbers.

When passed a string that is a canonical number for use as a subscript, YottaDB automatically converts it to a number. This automatic internal conversion is immaterial for applications:

  • that simply store and retrieve data associated with subscripts, potentially testing for the existence of nodes; or

  • whose subscripts are all numeric, and should be collated in numeric order.

This automatic internal conversion is material to applications that use:

  • numeric subscripts and expect the subscripts to be sorted in lexical order rather than numeric order; or

  • mixed numeric and non-numeric subscripts, including subscripts that are not canonical numbers.

Applications that are affected by automatic internal conversion should prefix their subscripts with a character such as "x" which ensures that subscripts are not canonical numbers.
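As a minimal sketch of the prefixing approach, the following C helper builds a subscript that can never be a canonical number. The name make_string_subscript and the buffer handling are assumptions for illustration and are not part of the YottaDB API; an application would pass the resulting string wherever it passes a subscript.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of the YottaDB API): writes "x" followed by
 * raw into dst, so that the result is never a canonical number.
 * Returns the length of the prefixed subscript, or -1 if dst is too small. */
static int make_string_subscript(char *dst, size_t dstlen, const char *raw)
{
    int n = snprintf(dst, dstlen, "x%s", raw);

    if (n < 0 || (size_t)n >= dstlen)
        return -1;      /* output error, or the result would be truncated */
    return n;
}
```

With this helper, the subscripts "9" and "10" become "x9" and "x10", which collate as strings ("x10" before "x9") rather than as numbers.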

In contexts where a string is coerced to a number (for example ydb_incr_s() / ydb_incr_st()) the coercion rules are the same as the M “+” unary operator.

Canonical Numbers

Conceptually, a canonical number is a string from the Latin character set that represents a decimal number in a standard, concise, form.

  1. Any string of decimal digits, optionally preceded by a minus sign ("-"), the first of which is not "0" (except for the number zero itself), that represents an integer of no more than 18 significant digits.

    • The following are canonical numbers: "-1", "0", "3", "10", "99999999999999999999", "999999999999999999990". Note that the last string has only 18 significant digits even though it is 19 characters long.

    • The following are not canonical numbers: "+1" (starts with "+"), "00" (has an extra leading zero), "999999999999999999999" (19 significant digits), "-0" (the canonical representation of 0 is "0").

  2. Any string of decimal digits that includes one decimal point ("."), optionally preceded by a minus sign ("-"), the first and last of which are not "0", that represents a number of no more than 18 significant digits.

    • The following are canonical numbers: "-.1", ".3", ".99999999999999999999".

    • The following are not canonical numbers: "+.1" (starts with "+"), "0.3" (first digit is "0"), ".999999999999999999990" (last digit is "0"), ".999999999999999999999" (more than 18 significant digits).

  3. Either of the above two forms followed by "E" (upper case only) followed by a canonical integer in the range -43 to 47 such that the magnitude of the resulting number is between 1E-43 and .1E47.

A non-empty value is a canonical number if, and only if, the string consisting of the single byte $char(0) follows it in subscript collation order. In M code, if x is the value, this test would be written $zlength(x)&($char(0)]]x) (see String Relational Operators).
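For code that is not written in M, the first of the three forms can be checked directly. The following is a sketch only: the function name is_canonical_integer is hypothetical, it covers integers (form 1) and deliberately omits the decimal point and "E" forms, and it does not enforce the overall magnitude limit. It is not how YottaDB itself performs the check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Sketch of form 1 only (canonical integers); assumption-laden illustration,
 * not YottaDB's implementation. */
static bool is_canonical_integer(const char *s)
{
    size_t i = 0, len = strlen(s);

    if (len == 0)
        return false;
    if (strcmp(s, "0") == 0)
        return true;                /* zero is written exactly as "0" */
    if (s[0] == '-')
        i = 1;                      /* optional minus sign; "+" is not allowed */
    if (i == len || s[i] == '0')
        return false;               /* no digits, a leading zero, or "-0" */
    for (size_t j = i; j < len; j++)
        if (s[j] < '0' || s[j] > '9')
            return false;           /* only decimal digits may follow */
    while (len > i && s[len - 1] == '0')
        len--;                      /* trailing zeros are not significant */
    return len - i <= 18;           /* at most 18 significant digits */
}
```

For example, is_canonical_integer("10") is true, while is_canonical_integer("+1"), is_canonical_integer("00") and is_canonical_integer("-0") are all false.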

Zwrite Format

Strings used as subscripts and as values can include unprintable bytes, for example control characters or binary data. YottaDB's zwrite format is an encoding in printable ASCII of any sequence of bytes. Unlike formats such as Base64, the zwrite format attempts to preserve readability of printable ASCII characters. Note that a zwrite formatted string is always longer than the original string (at the very least, it has enclosing quotes).

Signals

As YottaDB includes a real-time database engine that resides within the address space of a process, applications that use signals must not interfere with database operation.

When the YottaDB database engine initializes on the first call into the API, it initializes signal handling as follows:

  • SIGALRM – YottaDB uses this signal extensively and sets its own signal handler for SIGALRM. Application code should not use SIGALRM, and must never replace YottaDB's handler. YottaDB provides an API for applications that need timing functionality (see Utility Functions).

  • SIGCHLD (formerly SIGCLD) – Set to SIG_DFL for the default action.

  • SIGTSTP, SIGTTIN, and SIGTTOU – As suspending a real-time database engine at an inopportune moment is undesirable, YottaDB sets a signal handler for these signals that defers process suspension until the engine is in a state where it is safe to suspend.

  • SIGCONT – YottaDB sets a handler that continues a suspended process, and does nothing if the process is not suspended.

  • SIGUSR1 – As YottaDB uses this signal to asynchronously execute the M code in the $zinterrupt intrinsic special variable, it sets an appropriate handler. If non-M code is currently active when the process receives a SIGUSR1, the handler defers the signal till such time as M code is active. If an application uses no M code whatsoever, and does not intend to, it can change the SIGUSR1 handler after the first call to YottaDB. If an application has, or in the future may have, M code, it is best to leave the YottaDB handler in place.

  • SIGUSR2 – If the environment variable ydb_treat_sigusr2_like_sigusr1 is not set, YottaDB sets a SIG_IGN handler. SIGUSR2 is available for applications to use. To do so, set a handler after the first call to YottaDB. If the environment variable is set, YottaDB invokes $zinterrupt as described above, except that it also sets the $zyintrsig intrinsic special variable to 1 so that the code in $zinterrupt can determine which signal it is responding to.

  • SIGINT – When the main program is yottadb, YottaDB sets a handler for SIGINT (aka Ctrl-C) and the behavior is as documented at CENABLE. When the main program is not yottadb (i.e., a call-in or Simple API application), see below.

  • SIGABRT, SIGBUS, SIGFPE, SIGILL, SIGIOT, SIGSEGV and SIGTRAP – These signals are fatal, and the YottaDB handler terminates the process with a core dump. See the discussion about core dumps in the description of ydb_fork_n_core(). Although YottaDB normally cleans up processes' interaction with databases on exit, these signals can indicate that the process is in a bad state and that its code and data cannot be trusted. The process therefore does not attempt to clean up before exit. After a fatal signal, no YottaDB functions can be called except ydb_exit(). In the event an application must use its own handler for one of these signals, it must either save YottaDB's handler and drive it before process termination, or call ydb_exit() prior to process exit. [2]
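Where an application does need its own handler for a signal on which this is permitted, saving the handler already in place and driving it from the application's handler can be sketched with sigaction() as follows. All function and variable names here are hypothetical. The sketch assumes it is called after the first call into YottaDB, so that the saved handler is the one YottaDB installed, and it must never be applied to SIGALRM.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static struct sigaction saved_action;          /* handler in place before ours */
static volatile sig_atomic_t app_handler_ran;  /* set by the application handler */

/* Application handler: does its own async-signal-safe work, then drives the
 * saved handler. A saved SIG_DFL or SIG_IGN is not emulated here. */
static void chained_handler(int sig, siginfo_t *info, void *context)
{
    app_handler_ran = 1;
    if (saved_action.sa_flags & SA_SIGINFO) {
        if (saved_action.sa_sigaction != NULL)
            saved_action.sa_sigaction(sig, info, context);
    } else if (saved_action.sa_handler != SIG_DFL &&
               saved_action.sa_handler != SIG_IGN) {
        saved_action.sa_handler(sig);
    }
}

/* Saves the current disposition of sig, then installs chained_handler.
 * Returns 0 on success and -1 on failure. */
static int install_chained_handler(int sig)
{
    struct sigaction sa;

    if (sigaction(sig, NULL, &saved_action) != 0)
        return -1;                  /* query first, so the save cannot race */
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = chained_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    return sigaction(sig, &sa, NULL);
}

/* Puts back the disposition saved by install_chained_handler(). */
static int restore_saved_handler(int sig)
{
    return sigaction(sig, &saved_action, NULL);
}
```

Querying the existing disposition before installing the new handler means saved_action is fully populated by the time chained_handler can run.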

In the case of a main program that calls YottaDB through the Simple API, YottaDB handles SIGINT, SIGQUIT, and SIGTERM as follows:

  • The first call by the main program into YottaDB establishes YottaDB signal handlers for these signals overriding any signal handlers established by the main program.

  • On subsequent receipt of one of these signals, the YottaDB signal handler checks if the database engine is in a state where it can be safely interrupted.

    • If not safe (in the middle of a non-interruptible operation, like database commit logic), YottaDB defers handling the signal until the database engine is in a safe state, when the operation finishes.

    • If safe, the YottaDB signal handler invokes the signal handler, if any, defined by the main program. There are two special cases:

      • If the main program has not established a signal handler for these signals, the default signal handler (SIG_DFL) would be in effect. Since these signals are considered fatal/terminating signals by YottaDB, it terminates the process.

      • If the main program has explicitly set these signals to be ignored (SIG_IGN), then the YottaDB signal handler does not drive any application signal handler for these signals.

    • However, if three such signals are sent in succession (and the process has still not terminated), the signal handler proceeds to terminate the process immediately even if it is in the middle of a non-interruptible operation.

A consequence of this behavior for Flask applications is that:

  • Ctrl-C on an interactive Flask application, that uses the YDBPython wrapper to make calls into YottaDB, stops the application because Flask sets up a signal handler for Ctrl-C (SIGINT) to terminate the process.

  • SIGTERM on a non-interactive Flask application, that uses the YDBPython wrapper to make calls into YottaDB, terminates the application because Flask sets the SIGTERM handler to SIG_DFL (the default signal handler).

YottaDB saves an application's signal handler during initialization and restores it if ydb_exit() is explicitly called prior to process exit. YottaDB does not reset existing signal handlers for signals it does not handle but calls the saved signal handler if the YottaDB handler returns (and doesn't exit).

As database operations such as ydb_set_s() / ydb_set_st() set timers, subsequent system calls can terminate prematurely with an EINTR. Such system calls should be wrapped to restart them when this occurs. An example from the file eintr_wrappers.h demonstrates how YottaDB itself is coded to handle system calls that terminate prematurely with an EINTR:

#define FGETS_FILE(BUF, LEN, FP, RC)                            \
{                                                               \
        do                                                      \
        {                                                       \
                FGETS(BUF, LEN, FP, RC);                        \
        } while (NULL == RC && !feof(FP) && ferror(FP) && EINTR == errno);      \
}
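Application code can follow the same pattern for its own system calls. The following is a minimal sketch for read(); the name read_retry is hypothetical and is not taken from eintr_wrappers.h.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: behaves like read(), but restarts the call when a
 * signal (for example, a timer) interrupts it before any data is transferred. */
static ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);
    return n;       /* bytes read, 0 at end of file, or -1 on a real error */
}
```

Errors other than EINTR are returned to the caller unchanged, so existing error handling around read() continues to apply.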

If YottaDB is used within a process with other code that cannot co-exist, or be made to co-exist, with YottaDB, for example, by safely saving and restoring handlers, separate the logic into multiple processes or use a client/server database configuration to place application logic and the database engine in separate processes (see Client/Server Operation).

Note

To reiterate because of its importance: never replace YottaDB's SIGALRM handler.

Forking

In this section, fork() refers to the fork() system call as well as other functions that may use fork() under the covers or effect similar functionality by other means.

Before a process that performs buffered IO executes fork(), it should execute fflush(). Otherwise, the child process will inherit unflushed buffers from the parent, which the child process will flush when it executes an fflush(). This is a general programming admonition, not specific to YottaDB except to the extent that M code within a parent process may have executed write commands which are still buffered when C code within the same process calls fork().
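A minimal sketch of this admonition in C follows. The wrapper name fork_after_flush is hypothetical; it flushes only C stdio buffers and does not address output buffered by M code.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: flush all stdio output streams, then fork, so that
 * the child does not inherit unflushed output from the parent. */
static pid_t fork_after_flush(void)
{
    fflush(NULL);       /* a NULL argument flushes every open output stream */
    return fork();      /* 0 in the child, child's pid in the parent, -1 on error */
}
```

Calling fork_after_flush() in place of fork() keeps the flush and the fork together, which makes the ordering harder to get wrong.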

An application that calls YottaDB functions from multiple threads within a process must ensure that only one thread at a time calls fork(). Failure to do so can result in unanticipated results, including abnormal process termination and structural damage to database files.

Threads

Important Notes:

  • Local variables, locks and transaction contexts are held by the process and not by the thread. In other words, these resources are shared by threads in a multi-threaded application, and YottaDB assumes that the threads of an application cooperate to manage the resources, e.g.

    • One thread may set a local variable node, and another thread may delete it.

    • One thread may acquire a lock and another may release it.

    • A global variable update within a transaction by one thread is immediately visible to another thread within the process, but is not visible to other processes until the transaction commits.

  • It is the responsibility of the application to avoid race conditions between threads in their use of resources managed by YottaDB at the level of the process. YottaDB does not ensure the absence of race conditions in accessing these resources because to do so would unduly restrict the freedom of application designers. For example, it is a legitimate design pattern to have one thread that provides one subscript of a node, and a different thread that provides a different subscript.

  • Simple API functions use an *errstr parameter to avoid a race condition and ensure they get the correct $zstatus when a function has an error return. In a single-threaded application that calls ydb_get_s() / ydb_get_st() to get the complete error text from $zstatus after a YottaDB function returns an error return code, $zstatus has correct and current information, since calls to YottaDB are entirely under the control of that single application thread. For a multi-threaded application, between the time a function returns with an error return code, and a subsequent call to ydb_get_s() / ydb_get_st() to get the value of $zstatus, another thread may call YottaDB, and the $zstatus returned will be from that subsequent call. A *errstr parameter in functions for multi-threaded applications provides the $zstatus for that call to the caller.

    • An application that does not want the $zstatus string can pass a NULL value for *errstr.

    • The string in errstr->buf_addr is always null terminated, which allows *errstr to be passed to standard system functions like printf().

    • In the event a buffer provided by an application is not long enough for a $zstatus, YottaDB truncates the string to be reported, rather than issuing an INVSTRLEN error (since a second error while attempting to report an error is likely to add confusion rather than enlightenment).

      • errstr->len_used is always set to the length of $zstatus, whether or not it is truncated.

      • If errstr->len_used is greater than errstr->len_alloc-1 it means $zstatus has been truncated.

Note that effective release r1.34 errstr is filled in appropriately if an error occurs in M code called from another language.

  • A multi-threaded application is permitted to use the YottaDB single-thread functions as long as the application ensures that all YottaDB access is performed only by one thread. A thread may use ydb_thread_is_main() to determine whether it is the thread that is calling YottaDB. YottaDB strongly recommends against this application design pattern: this functionality only exists to provide backward compatibility to a specific existing application code base.
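The truncation rule for errstr given in the notes above can be sketched as follows. So that the example compiles on its own, it declares a local stand-in structure with the three fields the text refers to (buf_addr, len_alloc and len_used); real code would use the buffer type from the YottaDB header instead. The name errstr_was_truncated is hypothetical.

```c
#include <stdbool.h>

/* Local stand-in, an assumption for illustration only: real code would use
 * the buffer structure declared by the YottaDB header. */
typedef struct {
    unsigned int len_alloc;     /* bytes available at buf_addr */
    unsigned int len_used;      /* length of $zstatus, truncated or not */
    char *buf_addr;             /* null-terminated error text */
} errstr_buf;

/* True if $zstatus did not fit. Written as len_used >= len_alloc, which is
 * the same test as len_used > len_alloc - 1 but cannot wrap around when
 * len_alloc is 0. */
static bool errstr_was_truncated(const errstr_buf *errstr)
{
    return errstr->len_used >= errstr->len_alloc;
}
```

An application that finds the text truncated could allocate a buffer of at least len_used + 1 bytes for subsequent calls.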

Even though the YottaDB data management engine is single-threaded and operates in a single thread, [3] it supports both single- and multi-threaded applications. Multi-threaded applications may call multi-threaded Simple API functions – those whose names end in _st() – as well as utility functions – those whose names end in _t(). Single-threaded applications may call the Simple API single-threaded functions – those whose names end in _s() – as well as utility functions – those whose names do not end in _t(). An application must not call both single-threaded and multi-threaded Simple API functions, and any attempt to do so results in a YottaDB error returned to the caller.

When a single-threaded application calls a YottaDB function, the application code blocks until YottaDB returns. This is the standard behavior of a function call in a single-threaded application, also known as a synchronous call.

In a multi-threaded application, the YottaDB engine runs in its own thread, which is distinct from any application thread. When a multi-threaded application calls a YottaDB function, the function puts a request on a queue for the YottaDB engine, and blocks awaiting a response – in other words, any call to YottaDB is synchronous as far as the caller is concerned, even if servicing that call results in asynchronous activity within the process. Meanwhile, other application threads continue to run, with the YottaDB engine handling queued requests one at a time. An implication of this architecture is that multi-threaded functions of the Simple API cannot recurse – a call to a multi-threaded function when another is already on the C stack of a thread results in a SIMPLEAPINEST error. While this is conceptually simple for applications that do not use Transaction Processing, transaction processing in a threaded environment requires special consideration (see Threads and Transaction Processing).

Programming in M is single-threaded. Single-threaded applications can call into M code, and M code can call single-threaded C code, as documented in Chapter 11 (Integrating External Routines) of the M Programmers Guide. Multi-threaded C applications are able to call M code through the ydb_ci_t() and ydb_cip_t() functions as documented here, with the restriction that if M code called through ydb_ci_t() or ydb_cip_t() calls out to C code, that C code is not permitted to start a transaction using ydb_tp_st().

Note that triggers, which are written in M, run in the thread of the YottaDB engine, and are unaffected by multi-threaded Simple API calls already on an application process thread's stack. However, if a trigger calls C code, and that C code calls ydb_ci_t() or ydb_cip_t(), that C code is not permitted to call ydb_tp_st().

Threads and Transaction Processing

As discussed in Transaction Processing, ydb_tp_s() / ydb_tp_st() are called with a pointer to the function that is called to execute an application's transaction logic.

In a single-threaded application, the YottaDB engine calls the TP function and blocks until it returns. The function may itself call YottaDB recursively, and the existence of a single thread ensures that any call to YottaDB occurs at the correct transaction nesting level.

In a multi-threaded application, the YottaDB engine invokes the TP function in another thread, but cannot block until it gets the message that the function has terminated with a value to be returned, because the engine must listen for messages from that function, as well as threads it spawns. Furthermore, one of those threads may itself call ydb_tp_s() / ydb_tp_st(). Therefore:

  • The YottaDB engine must know the transaction nesting level at which it is operating, responding to requests for service at that level, and block any transaction invocations at a higher (enclosing) level until the current transaction is closed (committed or rolled back).

  • After a transaction has closed, any further calls from threads invoking YottaDB for the closed transaction must receive errors.

To accomplish this, the Simple API functions for threaded applications – those ending in _st() – have a tptoken first parameter used as follows to provide the required transaction context of a thread.

  • When an application calls a Simple API function outside a transaction, it provides a value of YDB_NOTTP for tptoken.

  • When an application calls ydb_tp_s() / ydb_tp_st(), YottaDB generates a tptoken and passes it as the first parameter when it calls the function that implements the logic for the transaction. Any threads that this function spawns must provide this tptoken to YottaDB. Passing in a different or incorrect tptoken can result in hard-to-debug application behavior, including deadlocks.

  • When a Simple API function is called:

    • If tptoken is that of the current transaction, the request is processed.

    • If tptoken is that of a higher level transaction within which the current transaction is nested, the call blocks until the nested transaction completes (or nested transactions complete, since there may be multiple nesting levels).

    • If tptoken does not correspond to a higher level transaction (e.g., if it corresponds to a closed transaction or a nonexistent one), YottaDB returns an error.
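The three cases above can be expressed as a small decision function. This is a conceptual model only, with hypothetical names throughout; it is not YottaDB's implementation, it does not model YDB_NOTTP, and it assumes the open transactions are tracked as an array ordered from outermost to innermost.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { TP_PROCESS, TP_BLOCK, TP_ERROR } tp_action;

/* open_tokens[0] is the outermost open transaction and
 * open_tokens[depth - 1] is the current (innermost) one. */
static tp_action classify_tptoken(const uint64_t *open_tokens, size_t depth,
                                  uint64_t tptoken)
{
    if (depth == 0)
        return TP_ERROR;                /* no open transaction to match */
    if (tptoken == open_tokens[depth - 1])
        return TP_PROCESS;              /* token of the current transaction */
    for (size_t i = 0; i + 1 < depth; i++)
        if (tptoken == open_tokens[i])
            return TP_BLOCK;            /* token of an enclosing transaction */
    return TP_ERROR;                    /* closed or nonexistent transaction */
}
```

Once the innermost transaction closes, depth decreases by one, so a token that previously produced TP_PROCESS now produces TP_ERROR, and the token of the immediately enclosing transaction moves from TP_BLOCK to TP_PROCESS.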

Note

If the function implementing a transaction spawns threads (or coroutines executing in threads), those threads/coroutines must:

  • terminate before the function returns to YottaDB;

  • use a current tptoken when invoking YottaDB (in effect, switching transaction contexts; technically this violates ACID transaction properties, but is perhaps reasonable in a few restricted cases, such as creating background worker threads); or

  • not invoke YottaDB.

Should a thread/coroutine spawned in a function implementing transaction logic invoke YottaDB after the function has returned, the thread/coroutine will get an invalid token error message unless it uses a current tptoken.

Note

Sharing or passing tptoken values between threads/coroutines can lead to deadlocks and other hard-to-debug situations. YottaDB strongly recommends against such usage. If you have a legitimate use case, design it so that you can debug it when the inevitable error condition occurs.

Timers and Timeouts

Although the Simple API uses nanosecond resolution to specify all time intervals, in practice underlying functions may have coarser resolutions (microseconds or milliseconds). Furthermore, even with a microsecond or millisecond resolution, the accuracy is always determined by the underlying hardware and operating system, as well as factors such as system load.
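Since intervals are specified in nanoseconds, applications that think in seconds or milliseconds need a conversion. The helpers below are a sketch with hypothetical names; they assume the interval is held in an unsigned long long and do not guard against overflow for very large inputs.

```c
#define NSEC_PER_MSEC 1000000ULL
#define NSEC_PER_SEC  1000000000ULL

/* Hypothetical helpers: convert application-friendly units to nanoseconds. */
static unsigned long long msec_to_nsec(unsigned long long msec)
{
    return msec * NSEC_PER_MSEC;
}

static unsigned long long sec_to_nsec(unsigned long long sec)
{
    return sec * NSEC_PER_SEC;
}
```

For example, a quarter-second timeout would be written as msec_to_nsec(250), keeping the unit visible at the call site.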

Memory Allocation

Memory allocated by ydb_malloc() must be explicitly freed by ydb_free(). ydb_exit() does not free memory, and any memory allocated but not freed prior to ydb_exit() is released only on process exit.

Syslog

Issues that pertain to the application, and on which application code can take reasonable action, are reported to the application (YDB_ERR_GVUNDEF being an example). Issues that pertain to operations, and on which operations staff rather than application code can take reasonable action (like running low on filesystem space), are reported to the syslog; they are not discussed here, as this is a Programmers Guide. In the event that a syslog does not exist (e.g., in default Docker containers), a process' syslog messages go to its stderr.

YottaDB uses the existence of /dev/log as an indicator of the existence of a syslog.
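An application or startup script that wants to know in advance where such messages will go can make a similar check. The sketch below uses access() and a hypothetical function name; how YottaDB itself performs the test is not stated here and may differ.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: true if the given path exists. */
static bool path_exists(const char *path)
{
    return access(path, F_OK) == 0;
}
```

Under this sketch, path_exists("/dev/log") returning false suggests that syslog messages from the process will appear on its stderr.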

IO

Although YottaDB does not prohibit it, we recommend against performing IO to the same device from M and non-M code in a process unless you know exactly what you are doing and have the expertise to debug unexpected behavior. Owing to differences in buffering, and in the case of interactive sessions, setting terminal characteristics, performing IO to the same device from both M and non-M code will likely result in hard to troubleshoot race conditions and other behavior.