DBMS Grammar

A SQL statement can be a Schema statement, a Data statement, or a SELECT statement.

A Schema statement creates and manipulates a unique schema within the database.

A Data statement is any statement that makes a change to the data in the database. Changes to data can be brought about by deleting data, inserting new data or updating existing data.

A SELECT statement is used to select and view data from the database.

Note

Some functions and keywords described in this documentation have yet to be implemented, and currently generate an error as we move toward a complete implementation.

Note

Comments can be placed within SQL statements using --, # or the /*...*/ symbols.

Accepted Data Types

Character Data Types

  • CHARACTER

  • CHAR

  • CHARACTER VARYING

  • CHAR VARYING

  • VARCHAR

Octo does not yet differentiate between these data types. All these types are currently treated as VARCHAR. They can be used to store strings and can be followed by an optional size which specifies the maximum character length (not the byte length which could be different in case of non-ascii characters) of a string that can be stored in this column. Example: VARCHAR(20) allows strings up to 20 characters to be stored.

As required by the SQL standard, an attempt to store a longer string into a column of these types will result in a VARCHAR_TOO_LONG error, unless the excess characters are all spaces, in which case the string will be truncated to the maximum length.

If the string to be stored is shorter than the maximum column size, the shorter string will be stored as is.

As required by the SQL standard, if one explicitly casts a value to VARCHAR(n), then an over-length value will be truncated to n characters without raising an error.

In general, Octo treats all single-quoted string literals (for example, 'str') as having a VARCHAR data type.

As an exception, Octo treats a special set of single-quoted string literals (for example 'f', 't' etc.) as BOOLEAN data type (and not VARCHAR) if used in a context that expects a boolean value. Note though that if the very same special literal is not used in a context that expects a boolean value, it is treated as having a VARCHAR data type. See Boolean Data Type section below for more information.

Numeric Data Types

  • NUMERIC

  • DECIMAL

  • DEC

  • INTEGER

  • INT

  • INT2

  • INT4

  • INT8

  • SMALLINT

  • BIGINT

Note that Octo does not differentiate between the various integer types listed above, internally treating them all as a single integer type. Similarly, Octo treats NUMERIC and various decimal types interchangably under a single numeric type.

Details about the range and accuracy of both DECIMAL/NUMERIC and INTEGER types can be found in the YottaDB M Programmer's Guide.

NUMERIC, DECIMAL and DEC can optionally be followed by a precision value in parentheses. Example: dec(10).

Note

The specified precision values are ignored when queries are executed.

Boolean Data Type

Octo uses 0 and 1 internally to represent boolean false and true respectively. However true and false can be used in Octo queries in the following manner:

SELECT *
FROM names
WHERE true;

In addition to accepting false and true as boolean literals, Octo also treats the single-quoted string literals in the following table as having a BOOLEAN data type (and not VARCHAR data type) if used in a context that expects a boolean value. Note though that if the same special literal are not used in a context that expects a boolean value, it is treated as having a VARCHAR data type (the default data type for single-quoted string literals).

String Literal

Boolean value

'f'

false

'false'

false

'n'

false

'no'

false

'0'

false

't'

true

'true'

true

'y'

true

'yes'

true

'1'

true

For boolean values, the following predicates are available (e.g. in WHERE clauses):

boolean_expression IS TRUE
boolean_expression IS NOT TRUE
boolean_expression IS FALSE
boolean_expression IS NOT FALSE
boolean_expression IS UNKNOWN
boolean_expression IS NOT UNKNOWN

The expressions TRUE/FALSE are self-explanatory; IS UNKNOWN/IS NOT UNKNOWN is the same as IS NULL/IS NOT NULL, but only for boolean values.

Here's a small example:

DROP TABLE IF EXISTS TEST;
CREATE TABLE TEST (
        ID INTEGER PRIMARY KEY,
        FOO BOOLEAN
);
INSERT INTO TEST VALUES (1, TRUE), (2, FALSE), (3, NULL);
SELECT * FROM TEST WHERE FOO IS TRUE;
SELECT * FROM TEST WHERE FOO IS FALSE;
SELECT * FROM TEST WHERE FOO IS UNKNOWN;

Date and Time Data Types

Octo implements date and time types conforming to Gregorian calendar.

Date and Time types can be specified in the following formats:

  • TEXT (refers to values like '2023-01-01' and '01:01:01')

  • HOROLOG (refers to values equivalent to output of $HOROLOG)

  • ZHOROLOG (refers to values equivalent to output of $ZHOROLOG)

  • ZUT (refers to values equivalent to output of $ZUT)

  • FILEMAN (refers to values of the format YYYMMDD.HHMMSS)

    • Where YYY is year since 1700 with 000 not being allowed

    • MM is month

    • DD is date

    • HH is hour

    • MM is minute

    • SS is second

    • Month and Date have to be in 2 digits

Different types that are available are:

  • DATE

  • TIME [WITHOUT TIME ZONE]

  • TIMESTAMP [WITHOUT TIME ZONE]

  • TIME WITH TIME ZONE

  • TIMESTAMP WITH TIME ZONE

TIMESTAMP will have both date and time information. TIME can have seconds specified with subseconds up till microsecond precision.

TIME WITH TIME ZONE and TIMESTAMP WITH TIME ZONE values are stored in UTC and displayed in the active time zone.

WITHOUT TIME ZONE usage is optional for TIME AND TIMESTAMP types.

TIME WITH TIME ZONE is not well supported at this time. As date information is not accepted the time given will be treated as the current date's time.

Note some formats are not allowed for some types. Such inputs are disallowed, an error is issued upon its usage. Following lists all such cases:

  • FILEMAN and TIME

  • FILEMAN and TIME WITH TIME ZONE

  • ZUT and TIME

  • ZUT and TIME WITH TIME ZONE

  • ZUT and TIMESTAMP WITH TIME ZONE

Input of such type results in an error. In case of output, an empty value will be seen in the result table where the column value type and output format ends up matching any of the above combinations.

Literals

Date and time literals can be specified in the following format:

  • type[(format)]'literal'

  • format when not specified is considered to be text. text cannot be explicitly specified.

    Examples:

    date '2023-01-01'
    date(horolog)'66475'
    date(zhorolog)'66475,,,'
    date(fileman)'2320101'
    date(zut)'1701123704344883'
    timestamp '2023-01-01 01:01:01'
    timestamp '2023-01-01T01:01:01'
    timestamp(horolog)'66475,3661'
    timestamp(fileman)'3230101.010101'
    timestamp with time zone '2023-01-01 01:01:01.567555-05:00'
    timestamp with time zone '2023-01-01 01:01:01.567555-08:30'
    timestamp with time zone '2023-01-01T01:01:01.567555-08:30'
    timestamp with time zone '2023-01-01 01:01:01-08:30'
    timestamp(zhorolog) with time zone '66885,44246,905676,18000'
    time '01:01:01.567555'
    time 'T01:01:01.567555'
    
  • Subsecond information is specified after the second with a period as shown in examples above. Subseconds after microsecond precision are ignored.

  • Time zone information is specified after seconds without space and can be a value in the range -15:59:59 to +15:59:59. : is required between hours, minutes and seconds. Time zone fields have to be specified with 2 digits. Minutes and seconds are optional. +00:00 or +00 should be used for UTC.

Columns and Function definition

Date and time column specification in CREATE TABLE is in the following format:

  • column_name type[(format)]

  • format when not specified is considered to be text. text cannot be explicitly specified.

    Examples:

    create table test (id integer, dob date);
    create table test (id integer, tob time(horolog));
    create table test (id integer, tos timestamp(fileman));
    create table test (id integer, tos timestamp(zhorolog) with time zone);
    

Date and time type specification in CREATE FUNCTION is similar.

Examples:

create function testf(integer, date) returns date as $$date^%test;
create function testf(integer, time(horolog)) returns time(horolog) as $$time^%test;
create function testf(integer, timestamp(fileman)) returns timestamp(fileman) as $$timestamp^%test;

Output format

Output format for date and time values can be specified through octo.conf by adding the following key value pair.

datetimeoutputformat = "value"

"value" can be "horolog", "zhorolog", "zut", "fileman", "text". By default output format will be "text".

If output format and columns of result table have date/time type and format combination that is disallowed then result table for such columns will display NULL values.

Input and Output text format value

datestyle configuration controls the pattern in which date and time values in TEXT format are accepted as input and seen in output.

By default the value of datestyle is the following

ISO, YMD

Where the output form of date and time value is ISO and the input form is YMD.

The output form corresponds to date/time values following the ISO 8601 (SQL) standard:YYYY-MM-DD HH:MM:SS. Since this is the only form of output supported by Octo it is not mandatory to include ISO while setting datestyle.

The input form determines how a date value needs to be formed .i.e. in which order day, month and year can be specified in a literal or global value. In YMD, Y the year, M represents month and D the date. They can be specified in any order and in upper or lower case.

Following are some examples of datestyle values:

ISO, YMD
MDY
YMD
DMY
MDY, ISO

When setting datestyle a comma separator is needed if both input and output form is being set. Space is optional and the order of input and output form doesn't matter.

datestyle configuration can be set using SET command or in octo.conf.

Following shows an example key value pair to include in octo.conf to set datestyle. Note, double quotes must be used for the value.

datestyle = "ISO, YMD"

Following shows an example of how SET command can be used to set datestyle. Note, single quotes must be used for the value.

SET datestyle='ISO, YMD'

Note: - must be used as delimiter for date and : for time.

Following are some examples of date/time literals:

-- when datestyle is 'ymd'
date'2023-01-01'

-- when datestyle is 'mdy'
date'01-01-2023'

datestyle default value when MYSQL emulation is being done is also ISO, YMD.

Readonly Table

For globals to be mapped with date and time type columns, the values should be in the following format:

DATE

TEXT     - value has to be in the format YYYY-MM-DD where YYYY is year, MM is month, DD is date. Ordering has to match ``datestyle``.
FILEMAN  - YYYMMDD
HOROLOG  - 1st part of $HOROLOG, example: 66475
ZHOROLOG - value has to be in $ZHOROLOG format. Empty values are accepted. Example: "66787,43301,," or  "66787,,,"
ZUT      - value equivalent to $ZUT output

TIME

TEXT     - value has to be in HH:MM:SS.UUUUUU format where HH is hours, MM is minutes, SS is seconds and UUUUUU is subsecond. ``THH`` instead of ``HH`` is allowed as well.
FILEMAN  - not allowed
HOROLOG  - 2nd part of $HOROLOG, example: 3661
ZHOROLOG - value has to be in $ZHOROLOG format. Empty values are accepted. Example: "66787,3661,," or  ",3661,,"
ZUT      - not allowed

TIMESTAMP

TEXT     - value has to follow the syntax:``"DATE TIME"`` or ``"DATEtTIME"``. Date and Time itself are specified as shown in ``DATE`` and ``TIME`` sections above
FILEMAN  - YYYMMDD.HHMMSS
HOROLOG  - value equivalent to $HOROLOG, example: "66475,3661"
ZHOROLOG - value equivalent to $ZHOROLOG, example: "66795,53635,237558,"
ZUT      - value equivalent to $ZUT output, example: "1700250859368731"

TIME WITH TIME ZONE

TEXT     - value has to follow the syntax:``"TIME+/-TIMEZONE"`` where TIMEZONE can be -15:59:59 to +15:59:59 and TIME as specified in the ``TIME`` section above
FILEMAN  - not allowed
HOROLOG  - 2nd part of $HOROLOG, example: 3661
ZHOROLOG - value has to be in $ZHOROLOG format. Empty values are accepted. Example: ",3661,,18000" or  ",3661,,"
ZUT      - not allowed

TIMESTAMP WITH TIME ZONE

TEXT     - value has to follow the syntax:``"DATE TIME+/-TIMEZONE"`` or ``"DATEtTIME+/-TIMEZONE. Refer to previous ``DATE``,``TIME`` and ``TIME WITH TIME ZONE`` sections for the syntax of each field
FILEMAN  - YYYMMDD.HHMMSS
HOROLOG  - value equivalent to $HOROLOG, example: "66475,3661"
ZHOROLOG - value equivalent to $ZHOROLOG, example: "66795,53635,237558,18000"
ZUT      - not allowed

Literals also have to follow the above rules to specify a date/time type value of a particular format.

An important note about date/time columns in table definition

  • When a table is readwrite, date/time value inserted into the table will be stored in a global in Octo's internal date/time format

  • If a table is mapped to a pre-existing global and there are no keywords in the table definition which makes the table read-only, the table is considered read-write. This will lead to date/time column values in the global node being incorrectly treated as Octo's internal format values when they clearly are not. To avoid such a problem, ensure that when mapping to a pre-existing global readonly is used in the table definition.

  • datestyle needs to be properly set when accessing date or timestamp type column values from readonly tables. The current datestyle setting needs to match the format of the date/timestamp stored in the global node. Or else the column value will show up incorrectly.

Range

[] - is used to indicate optional values.

Range for text format input:

Date                     : 0000-01-01 to 9999-12-31
Time                     : 00:00:00.000000 to 23:59:59.999999
Time with time zone      : 00:00:00.000000-15:59:59 to 23:59:59.999999+15:59:59
Timestamp                : 0000-01-01 00:00:00.000000 to 9999-12-31 23:59:59.999999
Timestamp with time zone : 0000-01-01 00:00:00.000000-15:59:59 to 9999-12-31 23:59:59.999999+15:59:59 (01-JAN-0000 00:00:00.000000-15:59:59 to 31-DEC-9999 23:59:59.999999+15:59:59)

Range for Horolog format input:

Date     : -365 (01-JAN-1840) to 2980013 (31-DEC-9999)
Time     : 0 (00:00:00) to 86399 (23:59:59)
Timestamp: -365,0 (01-JAN-1840 00:00:00) to 2980013,86399 (31-DEC-9999 23:59:59)

Range for Fileman format input:

Date     : 0010101 (01-JAN-1701) to 9991231 (31-DEC-2699)
Timestamp: 0010101.000000 (01-JAN-1701 00:00:00) to 9991231.235959 (31-DEC-2699 23:59:59)

Range for ZHorolog format input:

Date     : -365,,, to 2980013,,, (01-JAN-1840 to 31-DEC-9999)
Time     : ,0,[0], to ,86399,[999999], (00:00:00.00000) to 23:59:59.999999)
Time with time zone: ,0,[0],43200 to ,86399,[999999],-50400 (00:00:00.00000-12:00) to 23:59:59.999999+14:00)
Timestamp: -365,0,[0], to 2980013,86399,[999999], (01-JAN-1840 00:00:00.000000 to 31-DEC-9999 23:59:59.999999)
Timestamp with time zone: -365,0,[0],43200 to 2980013,86399,[999999],-50400 (01-JAN-1840 00:00:00.000000-12:00 to 31-DEC-9999 23:59:59.999999+14:00)

Range for ZUT format input:

Date: -62167219200000000 (0000-01-01) to 253402214400000000 (9999-12-31)
Timestamp: -62167219200000000 (0000-01-01 00:00:00.000000) to 253402300799999999(9999-12-31 23:59:59.999999)

Operations and allowed operand types

+

DATE + TIME
TIME + DATE
DATE + TIMEWITHTIMEZONE
TIMEWITHTIMEZONE + DATE
DATE + INTEGER
INTEGER + DATE
TIMEWITHTIMEZONE + TIME
TIME + TIMEWITHTIMEZONE
TIME + TIMESTAMP
TIMESTAMP + TIME
TIME + TIMESTAMPWITHTIMEZONE
TIMESTAMPWITHTIMEZONE + TIME
NULL + TIME
TIME + NULL
NULL + TIMESTAMP
TIMESTAMP + NULL

-

DATE - DATE
DATE - TIME
DATE - INTEGER
DATE - NULL
TIMEWITHTIMEZONE - TIME
TIMESTAMP - TIME
TIMESTAMPWITHTIMEZONE - TIME
TIMEWITHTIMEZONE - NULL
TIMESTAMP - NULL

||

  • Only valid when one of the operand is of type VARCHAR or has the value NULL

>, >=, <, <=, =, !=, ANY(), ALL(), BETWEEN, NOT BETWEEN, IN, NOT IN, IS NULL, IS NOT NULL

  • TIME and TIME WITH TIME ZONE can only be compared amongst themselves

  • Rest all types can be compared with each other

  • NULL can be included in the comparison

EXISTS and NOT EXISTS unary operation

  • Its working is similar to other types

CAST operation

  • :: and CAST

  • Valid conversions are

    DATE or TIMESTAMP or TIMESTAMPWITHTIMEZONE                       to DATE
    TIME or TIMEWITHTIMEZONE or TIMESTAMP or TIMESTAMPWITHTIMEZONE   to TIME
    DATE or TIMESTAMP or TIMESTAMPWITHTIMEZONE                       to TIMESTAMP
    TIME or TIMEWITHTIMEZONE                                         to TIMEWITHTIMEZONE
    DATE or TIMESTAMP or TIMESTAMPWITHTIMEZONE                       to TIMESTAMPWITHTIMEZONE
    STRING                                                           to any date/time type (if the value can be converted)
    
    • Input needs to be in text format for STRING to date/time conversion. Other conversions can have any of the internal formats as input.

    • Output will be in the date/time output format specified in octo.conf

  • :: can also be used to print a date/time value in zut, zhorolog, horolog or fileman. The returned value will be a varchar.

    Example:
    OCTO> select date'2023-01-01'::date(fileman);
    DATE
    3230101
    (1 row)
    

CASE statements

  • Its working is similar to other types

ARRAY statement

  • Its working is similar to other types

Aggregate functions

  • COUNT, MAX and MIN are supported

  • DISTINCT usage is also valid

  • table.* and * usage in the above aggregate functions are also valid

NULL_IF, COALESCE, GREATEST, LEAST

  • DATE and TIMESTAMP types can exists together as input to this function.

Date/time Functions

  • NOW() -> returns TIMESTAMP WITH TIME ZONE

  • LOCALTIME -> returns TIME

  • LOCALTIMESTAMP -> returns TIMESTAMP

  • CURRENT_TIME -> returns TIME WITH TIME ZONE

  • CURRENT_TIMESTAMP -> returns TIMESTAMP WITH TIME ZONE

  • DAY(DATE) -> returns VARCHAR

  • DAYOFMONTH(DATE) -> returns VARCHAR

  • DATE_FORMAT(TIMESTAMP,VARCHAR) -> returns VARCHAR

  • date_to_fileman(DATE) -> returns NUMERIC

  • timestamp_to_fileman(TIMESTAMP) -> returns NUMERIC

  • timestamptz_to_fileman(TIMESTAMP WITH TIME ZONE) -> returns NUMERIC

  • date_to_horolog(DATE) -> returns VARCHAR

  • time_to_horolog(TIME) -> returns VARCHAR

  • timestamp_to_horolog(TIMESTAMP) -> returns VARCHAR

  • timetz_to_horolog(TIME WITH TIME ZONE) -> returns VARCHAR

  • timestamptz_to_horolog(TIMESTAMP WITH TIME ZONE) -> returns VARCHAR

  • date_to_zhorolog(DATE) -> returns VARCHAR

  • time_to_zhorolog(TIME) -> returns VARCHAR

  • timestamp_to_zhorolog(TIMESTAMP) -> returns VARCHAR

  • timetz_to_zhorolog(TIME WITH TIME ZONE) -> returns VARCHAR

  • timestamptz_to_zhorolog(TIMESTAMP WITH TIME ZONE) -> returns VARCHAR

  • date_to_zut(DATE) -> returns INTEGER

  • timestamp_to_zut(TIMESTAMP) -> returns INTEGER

Note

TIME WITH TIME ZONE will consider time to be for the current date and is converted to UTC and back to local time zone

INSERT statement can add values which don't exactly match the column type. Following are the usecases which are valid.

For a date column following types can be inserted
* DATE/TIMESTAMP/TIMESTAMP WITH TIME ZONE

For a time column following types can be inserted
* TIME/TIME WITH TIME ZONE/TIMESTAMP/TIMESTAMP WITH TIME ZONE

For a time with time zone column following types can be inserted
* TIME WITH TIME ZONE/ TIME

For a timestamp column following types can be inserted
* TIMESTAMP/ TIMESTAMP WITH TIME ZONE/ DATE

For a timestamp with time zone column following types can be inserted
* TIMESTAMP/ TIMESTAMP WITH TIME ZONE/ DATE

In other cases a type mismatch error is seen. Following lists all the mismatch cases.

For a DATE column following types cannot be inserted
* TIME/TIME WITH TIME ZONE

For a TIME column following types cannot be inserted
* DATE

For a TIMESTAMP column following types cannot be inserted
* TIME/TIME WITH TIME ZONE

For a TIME WITH TIME ZONE column following types cannot be inserted
* DATE/TIMESTAMP/TIMESTAMP WITH TIME ZONE

For a TIMESTAMP WITH TIME ZONE column following types cannot be inserted
* TIME/TIME WITH TIME ZONE

Following examples show the error seen in this case.

OCTO> create table test (id int, dt date);
CREATE TABLE
OCTO> insert into test values(1, time'01:01:01');
[ERROR]: ERR_INSERT_TYPE_MISMATCH: Column "dt" is of type DATE but expression is of type TIME
OCTO> insert into test values(1, time with time zone'01:01:01-05');
[ERROR]: ERR_INSERT_TYPE_MISMATCH: Column "dt" is of type DATE but expression is of type TIME WITH TIME ZONE

OCTO> create table test (id int, t time);
CREATE TABLE
OCTO> insert into test values(1, date'2023-01-01');
[ERROR]: ERR_INSERT_TYPE_MISMATCH: Column "t" is of type TIME but expression is of type DATE

OCTO> create table test (id int, t timestamp);
CREATE TABLE
OCTO> insert into test values(1, time'01:01:01');
[ERROR]: ERR_INSERT_TYPE_MISMATCH: Column "t" is of type TIMESTAMP but expression is of type TIME
OCTO> insert into test values(1, time with time zone'01:01:01');
[ERROR]: ERR_INSERT_TYPE_MISMATCH: Column "t" is of type TIMESTAMP but expression is of type TIME WITH TIME ZONE

OCTO> create table test (id int, t time with time zone);
CREATE TABLE
OCTO> insert into test values(1, date'2023-01-01');
[ERROR]: ERR_INSERT_TYPE_MISMATCH: Column "t" is of type TIME WITH TIME ZONE but expression is of type DATE
OCTO> insert into test values(1, timestamp'2023-01-01 01:01:01');
[ERROR]: ERR_INSERT_TYPE_MISMATCH: Column "t" is of type TIME WITH TIME ZONE but expression is of type TIMESTAMP
OCTO> insert into test values(1, timestamp with time zone'2023-01-01 01:01:01-05');
[ERROR]: ERR_INSERT_TYPE_MISMATCH: Column "t" is of type TIME WITH TIME ZONE but expression is of type TIMESTAMP WITH TIME ZONE

OCTO> create table test (id int, t timestamp with time zone);
CREATE TABLE
OCTO> insert into test values(1, time'01:01:01');
[ERROR]: ERR_INSERT_TYPE_MISMATCH: Column "t" is of type TIMESTAMP WITH TIME ZONE but expression is of type TIME
OCTO> insert into test values(1, time with time zone'01:01:01');
[ERROR]: ERR_INSERT_TYPE_MISMATCH: Column "t" is of type TIMESTAMP WITH TIME ZONE but expression is of type TIME WITH TIME ZONE

In case of a readonly table, date/time column values having empty strings are treated as NULL values.

In-exact dates and timestamps of FILEMAN format are accepted but are converted to the nearest date or timestamp values. For example: date(fileman)'3230000' and date(fileman)'3230100' are considered to be equivalent to date(fileman)'3230101'. date(fileman)'3230001' is a similar value but is invalid.

FILEMAN midnight value is accepted but is converted to the nearest date. For example: timestamp(fileman)'2960714.24' is treated as timestamp(fileman)'2960715.000000'.

FILEMAN values where the time field (HHMMSS in YYYMMDD.HHMMSS) is only 1 digit is considered as HH with a trailing 0 when value is 1 or 2. A single digit value greater than 2 is invalid. Also, a 2 digit HH without MM and SS having 10 or 20 is invalid. If time field is 3 digits then the value is treated as HHMM where MM is 1 digit and its value cannot exceed 5. If time field is 5 digits then it is considered as HHMMSS where SS is single digit and its value cannot exceed 5. A valid single digit MM (in 3 digit HHMM value) and SS (in 5 digit HHMMSS value) is appended with a trailing 0. For example:

timestamp(fileman)'2960124.1' -> timestamp(fileman)'2960124.100000'
timestamp(fileman)'2960124.16263' -> timestamp(fileman)'2960124.162630'
timestamp(fileman)'2960124.162'   -> timestamp(fileman)'2960124.162000'

FILEMAN values with only year without any other fields are not allowed as literals and are treated as NULL values when present in globals. For example: date(fileman)'323' generates an error. If similar value is in a global and date(fileman) column is mapped to it then a select on the column will give a NULL value.

All FILEMAN literals that are invalid generate an ERROR. FILEMAN values in globals or those formed during conversion from other format values to FILEMAN are treated as NULL if they are found to be invalid.

When a FILEMAN value having time information is assigned to a DATE(FILEMAN) type, the value is validated by considering only the DATE portion. Time portion is ignored.

ZUT formatted DATE or TIMESTAMP refers to UTC time and not the present timezone time.

User defined functions with date/time parameters will receive date/time values in the format specified. Also, when text format input is expected, date/time value passed to the function implementation will be in the input form seen in datestyle. The output of the function is expected to be in the output form seen in datestyle.

Casting between SQL types

You can cast SQL values from one type to another using either the :: operator, or the CAST() function. For example:

-- `::` usage
select '1'::INTEGER;  -- Returns 1

-- `CAST()` usage
select CAST(1.1 AS INTEGER);  -- Returns 1
select CAST(1 AS BOOLEAN);  -- Returns 1

Constraints

PRIMARY KEY

A primary key constraint indicates that a column or group of columns can be used as a unique identifier for the rows in the table. The values of the columns that will be specified as the PRIMARY KEY must be both unique and not null.

Example:

CREATE TABLE Products
(ID int PRIMARY KEY,
Name char(20),
Price int);

Primary keys can span more than one column; this is referred to as table-level PRIMARY KEY.

CREATE TABLE Employee
(ID int,
FirstName char(20),
LastName char(30),
PRIMARY KEY (ID, FirstName, LastName));

Adding a primary key will enforce the column(s) to be marked NOT NULL.

A table can have at most one primary key. While relational database theory requires the existence of a primary key on each table, Octo does not enforce this rule.

CHECK constraint

A check constraint enforces that each value in the column must satisfy the given Boolean expression. It consists of the keyword CHECK followed by the expression in parentheses.

Example:

CREATE TABLE Products
(ID int PRIMARY KEY,
Name char(20) NOT NULL,
Price int CHECK (Price > 0));

The above example CREATEs a table named Products where the CHECK constraint is applied to the Price column, enforcing the check that every products price must be greater than 0.

The constraint can also be given a separate name, for example:

CREATE TABLE Products
(ID int PRIMARY KEY,
Name char(20) NOT NULL,
Price int CONSTRAINT price_check CHECK (Price > 0));

A check constraint can also combine or refer to several columns. For example:

CREATE TABLE Products
(ID int PRIMARY KEY,
Name char(20) NOT NULL,
Price int CHECK (Price > 0),
Discounted_price int CHECK (Discounted_price>0),
CHECK (Price > Discounted_price));

The above example CREATEs a table named Products where a CHECK constraint is applied to the Price column and the Discounted_price column separately, and on both the columns together as well. The constraints in the example above that are part of the column definition are referred to as column constraints; constraints that are written separately from any column definition, (e.g., CHECK (Price > Discounted_price)) are referred to as table constraints.

Note

A column constraint can be written as a table constraint while the reverse is not possible.

NOT NULL is also considered a constraint. A CHECK syntax and a constraint name can be applied to NOT NULL, but both those uses are discouraged.

IDENTITY

An IDENTITY keyword in the column specification of a CREATE TABLE indicates that the column is defined to be an auto-incrementing column. IDENTITY can only be applied on a column of type INTEGER. When an IDENTITY column is specified, a valid INSERT or UPDATE query will set an auto-incremented value as the column value in the row being created/updated. An auto-incrementing column implicitly has a NOT NULL constraint.

There are two types of IDENTITYs, a GENERATED ALWAYS AS IDENTITY which always sets an auto-incremented value to the column and a GENERATED BY DEFAULT AS IDENTITY which sets an auto-incremented value if no value is specified for the column.

CREATE TABLE table_name(.., column_name INTEGER GENERATED ALWAYS AS IDENTITY, ..);
CREATE TABLE table_name(.., column_name INTEGER GENEREATED BY DEFAULT AS IDENTITY, ..);

Following are a few examples of an identity column's working:

OCTO> CREATE TABLE Test(Id INTEGER GENERATED ALWAYS AS IDENTITY, Name TEXT);
CREATE TABLE

OCTO> INSERT INTO Test(Name) VALUES('first');
INSERT 0 1

OCTO> SELECT * FROM Test;
ID|NAME
1|first
(1 row)

OCTO> INSERT INTO Test VALUES(99,'second');
[ERROR]: ERR_INSERT_ON_GENERATED_ALWAYS_IDENTITY: Cannot INSERT into GENERATED ALWAYS identity column 'TEST.ID'. Use OVERRIDING SYSTEM VALUE to override.

OCTO> CREATE TABLE Test2 (Id INTEGER GENERATED BY DEFAULT AS IDENTITY, Name TEXT);
CREATE TABLE

OCTO> INSERT INTO Test2 VALUES(99,'first');
INSERT 0 1

OCTO> SELECT * FROM Test2;
ID|NAME
99|first
(1 row)

OCTO> INSERT INTO Test2(Name) VALUES('second');
INSERT 0 1

OCTO> SELECT * FROM Test2;
ID|NAME
99|first
1|second
(2 rows)

To override the default behaviors of IDENTITY columns, INSERT is provided OVERRIDING USER VALUE and OVERRIDING SYSTEM VALUE. The former provides a way to ignore user specified value for a GENERATED BY DEFAULT AS IDENTITY column and the latter allows explicit value to be specified for a GENERATED ALWAYS AS IDENTITY column. OVERRIDING SYSTEM VALUE usage on a GENERATED BY DEFAULT identity column does nothing and its usage on a GENERATED ALWAYS identity column without an explicit user value specified will just make use of the auto-increment value. Following are a few example usages (these build on the previous example in this section).

OCTO> INSERT INTO Test OVERRIDING SYSTEM VALUE VALUES(99, 'second');
INSERT 0 1

OCTO> SELECT * FROM Test;
ID|NAME
1|first
99|second
(2 rows)

OCTO> INSERT INTO Test2 OVERRIDING USER VALUE VALUES(100, 'third');
INSERT 0 1

OCTO> SELECT * FROM Test2;
ID|NAME
99|first
1|second
2|third
(3 rows)

Similarly, UPDATE is provided with DEFAULT keyword. This can be specified in the right hand side of a SET clause. When a column is set to DEFAULT, UPDATE will use auto-incremented value as the column value. Note that DEFAULT can only be specified for a column which is defined as IDENTITY.

UPDATE table_name SET column_name = DEFAULT, ...

Following is an example for updating an identity column with DEFAULT value (it builds on the previous example in this section)

OCTO> UPDATE Test SET Id = DEFAULT where Id = 99;
UPDATE 1

OCTO> SELECT * FROM Test;
ID|NAME
1|first
2|second
(2 rows)

OCTO> UPDATE Test2 SET Id = DEFAULT where Id = 99;
UPDATE 1

OCTO> SELECT * FROM Test2;
ID|NAME
3|first
1|second
2|third
(3 rows)

Maintaining Consistency

Data in YottaDB, the datastore for Octo, can be updated both by SQL INSERT, UPDATE, and DELETE statements as well as directly in YottaDB outside Octo. If your application does both, then it must ensure that the latter direct updates respect the constraints of Octo CREATE TABLE statements.

CREATE TABLE

CREATE TABLE [IF NOT EXISTS] table_name
(column_name data_type [constraints][, ... column_name data_type [constraints]])
[optional_keyword];

The CREATE TABLE statement is used to create tables in the database. The keywords CREATE TABLE are used followed by the name of the table to be created.

If IF NOT EXISTS is supplied for a CREATE TABLE statement and a table exists, the result is a no-op with no errors. In this case, error type INFO_TABLE_ALREADY_EXISTS is emitted at INFO log severity level.

The names of columns to be created in the database and their datatypes are then specified in a list, along with any constraints that might need to apply (such as denoting a PRIMARY KEY, UNIQUE KEY, FOREIGN KEY, NOT NULL or CHECK).

In READONLY tables, if none of the columns are specified as keys (PRIMARY KEY or KEY NUM not specified in any column) then the primary key for the table is assumed to be the set of all columns in the order given. In READWRITE tables, an invisible field is created to act as the primary key.

Example:

CREATE TABLE Employee
(ID int PRIMARY KEY,
FirstName char(20),
LastName char(30));

CREATE TABLE Employee
(ID int,
FirstName char(20),
LastName char(30));
/* is equivalent to */
CREATE TABLE Employee
(ID int,
FirstName char(20),
LastName char(30),
PRIMARY KEY (ID, FirstName, LastName));

By default, a column can have NULL values. The NOT NULL constraint enforces a column to not accept NULL values when a new row is added as part of a INSERT INTO command or an existing row is updated using a UPDATE command. Since both these commands can be run only on READWRITE tables, the NOT NULL keyword is ignored in the case of READONLY tables.

Example:

CREATE TABLE Employee
(ID int PRIMARY KEY,
FirstName char(20) NOT NULL,
LastName char(30) NOT NULL);

The above example CREATEs a table named Employee, where the FirstName and LastName columns cannot accept NULL values.

Note that CREATE TABLE statements can also accept a list of ASCII integer values for use in the DELIM qualifier, for example:

CREATE TABLE DELIMNAMES
(id INTEGER PRIMARY KEY,
firstName VARCHAR(30),
lastName VARCHAR(30),
middleInitial VARCHAR(1),
age INTEGER)
DELIM (9, 9) GLOBAL "^delimnames";

Here, two TAB characters (ASCII value 9) act as the internal delimiter of an Octo table. Note, however, that these delimiters are not applied to Octo output, which retains the default pipe | delimiter. The reason for this is that tables may be joined that have different delimiters, so one common delimiter needs to be chosen anyway. Thus, the default is used.

Note:

  • Table and column names may be specified as either unquoted identifiers, e.g. id or mytable, or as double-quoted identifiers, e.g. "id" or "mytable".

  • Unquoted identifiers are case insensitive and internally cast to lowercase, while double-quoted identifiers are case sensitive

  • Double-quoted identifiers may contain spaces and/or SQL keywords

  • Underscores are allowed in all identifiers

optional_keyword

An optional keyword can be added to further enhance the CREATE TABLE statement:

[ AIMTYPE | DELIM | END | ENDPOINT | EXTRACT | GLOBAL | KEY NUM | NOT NULL | PIECE | READONLY | READWRITE | START | STARTINCLUDE ]

The keywords denoted above are M expressions and literals. They are explained in the following table:

Keyword

Type

Range

Purpose

Overrides

Default Value

AIMTYPE

Integer Literal

Table

By default, when Octo uses a YottaDB component called AIM to created indexes, it does not include NULL data in the index. This can only happen when the data stored by Octo is on multiple levels, such as VistA data. Specifying "1" for a table means that we want data that is stored on other levels but is not currently present to be considered NULL. If you specify "1", the region housing the table MUST have NULL subscripts in globals disabled.

0

DELIM

Literal

Table, Column

Represents the delimiter string to be used in $PIECE() when obtaining the value of a particular column from the global variable node that stores one row of the SQL table. When specified at the column level, an empty delimiter string (DELIM "") is allowed. In this case, the entire global variable node value is returned as the column value (i.e. no $PIECE is performed).

table/default DELIM setting

"|"

END

Boolean expression

Table

A condition that is tested to see if the cursor has gone past the last record in the table. If the condition evaluates to TRUE then that is considered past the last record in the table.

Not applicable

""=keys(0)

ENDPOINT

Literal

Column

Include all records including this value but not any value after it.

Not applicable

""=keys(0)

EXTRACT

Expression

Column

Gets data based on the M expression or SQL function call following the EXTRACT keyword. M expressions may reference key columns of the current table by placing the column name in a keys() expression, e.g. keys(""id"") to reference a key column named id. Note the duplication of double quotes, which are necessary to escape double quotes within the double quote delimited containing string literal. Similarly, non-key columns may be referenced in M expressions using a values() expression, e.g. values(""firstName""). A SQL function call following the EXTRACT keyword may reference any column in the table, as long as the referenced column is not an EXTRACT column that refers back to the one referencing it, i.e. no circular dependencies.

PIECE, GLOBAL

Not applicable

GLOBAL

Literal

Table, Column

Represents the "source" location for a table. It consists of a global name followed by an optional list of subscripts. One may refer to a key column in the subscript by specifying keys("colname") where colname is the name of the key column. Note that key column names in keys(..) are case sensitive, regardless of whether the key column name itself is case sensitive. Note also that in the case of a READONLY table, if no key columns are specified, all columns in the order specified are automatically assumed to be key columns. In case of a READWRITE table, if no key columns are specified, a hidden key column is created by Octo with the name %YO_KEYCOL. See examples in this document for how you can construct the GLOBAL keyword. If the Table-level GLOBAL keyword specifiesa global name with no subscripts, Octo adds subscripts to it one for every key column that is explicitly specified or automatically assumed/generated but if the Column-level GLOBAL keyword specifies a global name with no subscripts no such automatic subscript addition takes place.

table/default GLOBAL setting

^%ydboctoD_$zysuffix(tablename)(keys("colname")) where tablename is the table name and colname is the name of the primary key column. If more than one key column exists, they will form more subscripts. For example, if keycol is a column that is specified with a PRIMARY KEY keyword and keycol2 is an additional column specified with a KEY NUM 1 keyword, then the default value would be ^%ydboctoD...(keys("keycol"),keys("keycol2"))

KEY NUM

Integer Literal

Column

Specifies an integer indicating this column as part of a composite key. The PRIMARY KEY column correponds to KEY NUM 0. The first key column is specified with a PRIMARY KEY keyword. All other key columns are specified with a KEY NUM keyword with an integer value starting at 1 and incrementing by 1 for every key column. Such a column is considered a key column and is part of the subscript in the global variable node that represents a row of the table. KEY NUM is legacy code that is required by VistA. Other users should use PRIMARY KEY instead.

Not applicable

Not applicable

NOT NULL

Not applicable

Column

Indicates that this column cannot take on a NULL value. An error is issued if an INSERT INTO or UPDATE command tries to set this column to a NULL value for READWRITE tables. For READONLY tables this keyword is ignored.

Not applicable

Not applicable

PIECE

Integer Literal

Column

Represents a piece number. Used to obtain the value of a column in a table by extracting this piece number from the value of the global variable node specified by the GLOBAL keyword at this column level or at the table level. The generated code does a $PIECE() on the value to obtain the value. See also DELIM keyword for the delimiter string that is used in the $PIECE.

default (column number, starting at 1 for non-key columns)

Not applicable

READONLY

Not applicable

Table

Specifies that the table maps to an existing YottaDB global variable and allows use of various keywords like START, END etc. in the same CREATE TABLE command. Queries that update tables like INSERT INTO, DELETE FROM etc. are not allowed in such tables. DROP TABLE command drops the table and leaves the underlying mapping global variable nodes untouched.

Not applicable

tabletype setting in octo.conf

READWRITE

Not applicable

Table

Is the opposite of the READONLY keyword. This allows queries that update tables like INSERT INTO, DELETE FROM etc. but does not allow certain keywords like START, END etc. in the same CREATE TABLE command. That is, it does not allow a lot of flexibility in mapping like READONLY tables do. But queries that update tables like INSERT INTO, DELETE FROM etc. are allowed in such tables. And a DROP TABLE command on a READWRITE table drops the table and deletes/kills the underlying mapping global variable nodes.

Not applicable

tabletype setting in octo.conf

START

Command expression

Column

Indicates where to start a FOR loop (using $ORDER()) for a given key column in the table.

Not applicable

""

STARTINCLUDE

Not applicable

Column

If specified, the FOR loop (using $ORDER()) that is generated for every key column in the physical plan processes includes the START value of the key column as the first iteration of the loop. If not specified (the default), the loop does a $ORDER() of the START value and uses that for the first loop iteration.

Not applicable

Not specified

In the table above:

  • table_name and cursor_name are variables representing the names of the table and the cursor being used.

  • keys is a special variable in Octo that contains all of the columns that are identified as keys in the DDL (either via the "PRIMARY KEY" or "KEY NUM X" set of keywords).

If the same CREATE TABLE command specifies READONLY and READWRITE, the keyword that is specified last (in left to right order of parsing the command) prevails. If neither of these two options are specified and octo.conf does not specify tabletype = "READONLY", the table will be implicitly assumed to be READWRITE.

A table will become READONLY under the following conditions:

  • If END, ENDPOINT, EXTRACT, SOURCE, START, or STARTINCLUDE keywords are used in the CREATE statement

  • If the DELIM keyword is specified in the first non-key column and has a value other than ""

  • If the PIECE number is not the same as the column number (first column is 1, second column is 2, etc.)

  • If the GLOBAL keyword is specified with subscripts that are not in a format compatible with READWRITE

If a DELIM "" is specified for a column, any PIECE keyword specified for that column is ignored and is treated as if the keyword was not specified.

For ENDPOINT, you can specify literals, M style $CHAR data, or a space. Note that to specify a space, you need to say '" "'. For $CHAR(n), say '$CHAR(n)'. Note that if you specify an empty string ('""'), you will get no records. In this case you should just omit ENDPOINT.

You can combine END and ENDPOINT together. If you do so, both conditions are evaluated; however, the END condition is evaluated before the ENDPOINT condition.

Examples

CREATE TABLE Orders
(OrderID INTEGER PRIMARY KEY,
 CustomerID INTEGER,
 EmployeeID INTEGER,
 OrderDate VARCHAR(16),
 ShipperID INTEGER)
GLOBAL "^Orders";

In the above example, the Orders table maps data in the nodes of the global variable ^Orders. ^Orders has a single subscript, OrderID. Its nodes are strings, whose | separated pieces are, respectively, CustomerID, EmployeeID, OrderDate, and ShipperID, e.g., ^Orders(535088)="9015|57|2021-08-26|17". "|" is the default piece operator.

CREATE TABLE Orders
(OrderID INTEGER PRIMARY KEY,
 CustomerID INTEGER,
 EmployeeID INTEGER,
 OrderDate VARCHAR(16),
 ShipperID INTEGER)
DELIM "^"
GLOBAL "^Orders";

This example is similar to the last, except that the nodes of ^Orders are strings whose pieces are separated by "^", e.g., ^Orders(535088)="9015^57^2021-08-26^17".

CREATE TABLE USPresidents
(FirstYear INTEGER,
 LastYear INTEGER,
 FirstName VARCHAR,
 MiddleName VARCHAR,
 LastName VARCHAR,
 BirthYear INTEGER,
 DeathYear INTEGER,
 PRIMARY KEY (FirstYear, LastYear))
GLOBAL "^USPresidents";

In the above example, ^USPresidents has records like ^USPresidents(1933,1945)="Franklin|Delano|Roosevelt|1882|1945" and ^USPresidents(2009,2017)="Barack||Obama|1961".

CREATE TABLE PresidentNames
(ID INTEGER PRIMARY KEY,
 FName VARCHAR PIECE 2,
 LName VARCHAR PIECE 1)
GLOBAL "^PresidentNames";

In the above example, ^PresidentNames has records like ^Names(1)="Lincoln|Abraham" and ^Names(2)="Obama|Barack".

CREATE TABLE AuthorNames
(ID INTEGER PRIMARY KEY,
 LName VARCHAR ,
 FName VARCHAR EXTRACT "$PIECE(^AuthorNames(keys(""id"")),""^"",2)")
DELIM "^"
GLOBAL "^AuthorNames";

In the above example, ^AuthorNames has records like ^Names(1)="Dahl^Roald" and ^Names(2)="Blyton^Enid".

CREATE TABLE Orders
(OrderID INTEGER PRIMARY KEY,
 CustomerID INTEGER,
 EmployeeID INTEGER,
 OrderDate VARCHAR(16),
 ShipperID INTEGER)
GLOBAL "^Orders"
READONLY;

In the above example, the Orders table is set to be READONLY. If the Orders table is DROPped then the underlying mapped global variable node (^Orders) will be untouched.

CREATE TABLE Orders
(OrderID INTEGER PRIMARY KEY,
 CustomerID INTEGER,
 EmployeeID INTEGER,
 OrderDate VARCHAR(16),
 ShipperID INTEGER)
GLOBAL "^Orders"
READWRITE;

In the above example, the Orders table is set to be READWRITE. If the Orders table is DROPped then the underlying mapped global variable nodes (^Orders) will be deleted.

CREATE TABLE Orders
(OrderID INTEGER PRIMARY KEY START 0 END "$CHAR(0)]]keys(""orderid"")",
 CustomerID INTEGER,
 EmployeeID INTEGER,
 OrderDate VARCHAR(16),
 ShipperID INTEGER)
GLOBAL "^Orders";

In the above example, the START and END keywords tell Octo what subset of the ^Orders nodes with one subscript should be mapped to the Orders table. START 0 indicates that subscripts greater than 0 should be mapped, and END "$CHAR(0)]]keys(""orderid"")" restricts the mapping to numeric subscripts. Note that the column name is defined as OrderID but the keys() syntax uses the lower cased column name orderid. This is because Octo currently assumes any column name that is not specified inside double quotes or back quotes to be a lower cased name.

Rather than using END in the previous example, you can use the simpler ENDPOINT, which will achieve the same result (the below example illustrates that). ENDPOINT will traverse the global until it reaches the specified endpoint, and it will include the end point record as well. Most of the time, ENDPOINT should be used to reach the end of a numeric subscript range. Therefore, a good value to use is '$CHAR(0)' or '" "', as these sort after numbers.

CREATE TABLE Orders
(OrderID INTEGER PRIMARY KEY START 0 ENDPOINT '$CHAR(0)',
 CustomerID INTEGER,
 EmployeeID INTEGER,
 OrderDate VARCHAR(16),
 ShipperID INTEGER)
GLOBAL "^Orders";
CREATE TABLE Orders
("OrderID" INTEGER PRIMARY KEY START 1 END "'+keys(""OrderID"")" STARTINCLUDE,
 CustomerID INTEGER,
 EmployeeID INTEGER,
 OrderDate VARCHAR(16),
 ShipperID INTEGER)
GLOBAL "^Orders";

In the above example STARTINCLUDE is used with START and END. In this case the FOR loop for $ORDER() includes the START value of the key column as the first iteration of the loop. Note that in the above example, the column name OrderID is specified inside double quotes. This lets the column name be taken as is (with the mixed case lettering) and so we can use keys() syntax with the mixed case column name.

CREATE TABLE extractnames (
    id INTEGER PRIMARY KEY,
    firstName VARCHAR(30),
    lastName VARCHAR(30),
    age INTEGER,
    fullname VARCHAR EXTRACT "$$^FULLNAME(values(""firstname""),values(""lastname""))"
) GLOBAL "^names(keys(""id""))";
; FULLNAME.m
FULLNAME(firstname,lastname)
    quit firstname_" "_lastname

In the above example, EXTRACT is used to define a computed column that references non-key columns. Non-key columns are referenced in EXTRACT functions by passing the column name as an M string literal to an expression of the form values(..). Note that the column name is defined as firstName but the values() syntax uses the lower cased column name firstname. This is because Octo currently assumes any column name that is not specified inside double quotes or back quotes to be a lower cased name.

CREATE TABLE extractnames (
    id INTEGER PRIMARY KEY,
    firstName VARCHAR(30),
    lastName VARCHAR(30),
    age INTEGER,
    fullname VARCHAR EXTRACT CONCAT(firstName, ' ', lastName),
    nameandnumber VARCHAR EXTRACT CONCAT(lastName, id::varchar)
) GLOBAL "^names(keys(""id""))";

In the above example, EXTRACT is used to define a computed column using a SQL function, in this case CONCAT().

In this example, the fullname column calls CONCAT() with the firstName and lastName columns of the table, along with a string literal containing a space. Similarly, the nameandnumber column calls CONCAT() with the lastName column and the id column, which is typecast as a VARCHAR for compatibility with CONCAT(), which requires string type arguments.

For more advanced DDL mapping examples, see Advanced Mapping of YottaDB Globals to Octo Tables.

Error Case

Note

A CREATE TABLE waits for all other concurrently running queries(SELECT or CREATE TABLE or DROP TABLE) to finish so it can safely make DDL changes. It waits for an exclusive lock with a timeout of 10 seconds. If it fails due to a timeout, the user needs to stop all concurrently running queries and reattempt the CREATE TABLE statement.

CREATE FUNCTION

CREATE FUNCTION [IF NOT EXISTS] function_name
([data_type[, data_type[, ...]]])
RETURNS data_type AS extrinsic_function_name;

The CREATE FUNCTION statement is used to create SQL functions that map to extrinsic M functions and store these mappings in the database. The keywords CREATE FUNCTION are followed by the name of the SQL function to be created, the data types of its parameters, its return type, and the fully-qualified extrinsic M function name.

The name of the SQL function may be specified as either unquoted identifiers, e.g. id or mytable, or as double-quoted identifiers, e.g. "id" or "mytable". Unquoted identifiers are case insensitive and cast internally to lowercase, while double-quoted identifiers are case sensitive. Additionally, double-quoted identifiers may contain spaces and/or SQL keywords.

If IF NOT EXISTS is supplied for a CREATE FUNCTION statement and a function exists, the result is a no-op with no errors. In this case, error type INFO_FUNCTION_ALREADY_EXISTS is emitted at INFO log severity level.

Note that Octo reserves the M routine prefix ^%ydbocto for internal functions defined by Octo itself. Moreover, Octo assumes that any YottaDB extrinsic function name that includes this prefix but omits a label will have its own _ydbocto*.m file containing emulation label mappings for PostgreSQL and MySQL. Accordingly, extrinsic function names like $$^ydboctoxyz will prompt Octo to look for a _ydboctoxyz.m file containing two labels, PostgreSQL and MySQL. If these labels are absent, a LABELMISSING will be issued by YottaDB. For this reason, it is advised that users do not use the ^%ydbocto prefix in extrinsic function names to avoid conflicts and complications with Octo internal M routines.

CREATE FUNCTION can be used to define multiple functions with the same name, provided the number of parameters and/or the types of the parameters are different. In other words, CREATE FUNCTION supports function overloading.

However, functions cannot be overloaded based on their return type. For example, if two CREATE FUNCTION calls are made with the same name and parameter types, but a different return type, the return type of the last executed statement will be retained and the first discarded. Accordingly, care should be used when overloading functions, particularly when specifying varied return types for a single function.

The SQL function's parameter data types are specified in a list, while the data type of the return value must be a single value (only one object can be returned from a function). The extrinsic function name must be of the form detailed in the M Programmer's Guide.

Example with parameters (see below for the M routine that goes with these examples):

CREATE FUNCTION ADD(int, int)
RETURNS int AS $$add^myextrinsicfunction;

CREATE FUNCTION APPEND(varchar, varchar)
RETURNS varchar AS $$append^myextrinsicfunction;

To create a parameterless function, the parameter type list may be omitted by leaving the parentheses blank:

Example:

CREATE FUNCTION USERFUNC()
RETURNS int AS $$userfunc^myextrinsicfunction;

Here's the M code (in routine myextrinsicfunction.m) that goes with these functions:

myextrinsicfunction
add(x,y)
 quit x+y
append(x,y)
 quit x_y
userfunc()
 quit 42

Here are the results of running this from Octo:

OCTO> select add(5,6);
ADD
11
(1 row)
OCTO> select append('foo','boo');
APPEND
fooboo
(1 row)
OCTO> select userfunc();
USERFUNC
42
(1 row)

When a function is created from a CREATE FUNCTION statement, an entry is added to Octo's internal PostgreSQL catalog. In other words, a row is added to the pg_catalog.pg_proc system table. To view a list of created functions, their argument number and type(s), and return argument type, you can run:

select proname,pronargs,prorettype,proargtypes
from pg_proc;

Type information for each function parameter and return type will be returned as an OID. This OID can be used to look up type information, including type name, from the pg_catalog.pg_type system table. For example, to retrieve the human-readable return type and function name of all existing functions:

select proname,typname
from pg_catalog.pg_proc
inner join pg_catalog.pg_type on pg_catalog.pg_proc.prorettype = pg_catalog.pg_type.oid;

However, function parameter types are currently stored as a list in a VARCHAR string, rather than in a SQL array as the latter isn't yet supported by Octo. In the meantime, users can lookup the type name corresponding to a given type OID by using the following query:

select oid,typname
from pg_catalog.pg_type;

Error Case

Note

A CREATE FUNCTION waits for all other concurrently running queries(SELECT or CREATE TABLE or DROP TABLE) to finish so it can safely make DDL changes. It waits for an exclusive lock with a timeout of 10 seconds. If it fails due to a timeout, the user needs to stop all concurrently running queries and reattempt the CREATE FUNCTION statement.

CREATE VIEW

A view can be formed from tables, functions and other views.

CREATE VIEW [IF NOT EXISTS] view_name (column_name_list) AS view_definition;

column_name_list: (column, column, ..)

view_definition: select_query or values clause or set_operation

The column_name_list is a comma separated list of column names that is assigned to the columns defined by the view_definition. column_name_list resolves any name collision that might be present in the view definition. Name collisions in the column_name_list itself will generate an error.

A view after its creation can be used in all the clauses where a table can be used. Joins can be performed with views and non-view relations. If a view depends on a table/function/another view, DROP command cannot be applied on the relation on which the view depends on. If applied an error describing the dependency is generated.

INSERT/DELETE/UPDATE queries are not allowed with VIEWS.

Example:

CREATE VIEW v1 AS select * from names;
select * from v1;

The above example creates a view with the name v1 and has as its definition a select query which iterates through all data in the names database. The SELECT on the view above will run select * from names and provide the same output as the defining query.

Example:

CREATE VIEW v1 (v1_id, v1_firstname, v1_lastname) AS select * from names;
select * from v1;

The above example creates a view with the name v1 and columns v1_id, v1_firstname and v1_lastname. The column names specified will be column names used while displaying the result. In the above example the underlying query will have columns id, firstname and lastname. These are replaced by v1_id, v1_firstname and v1_lastname.

Example:

CREATE VIEW v1 AS values(1,'first','second');
CREATE VIEW v2 AS select 1 union select 2;

The above example demonstrates that a view can be created with VALUES clause and SET OPERATION UNION. Other SET OPERATIONS like INTERSECT and EXCEPT can also be used.

Example:

CREATE VIEW v1 AS select * from names;
CREATE VIEW v2 AS select * from v1;

The above example demonstrates that a view can be created with other views.

Example:

CREATE VIEW v1 AS select max(id) from names;

The above example demonstrates that a view can be created with functions.

Error Case

Note

A CREATE VIEW waits for all other concurrently running queries(SELECT or CREATE TABLE or DROP TABLE) to finish so it can safely make DDL changes. It waits for an exclusive lock with a timeout of 10 seconds. If it fails owing to a timeout, retry when currently running queries complete, or after stopping them.

DISCARD ALL

DISCARD ALL;

As needed, Octo automatically creates physical plans, cross references, database triggers, and other internal artifacts that allow it to execute queries correctly and quickly. The DISCARD ALL command deletes these internal artifacts. Octo also automatically discards artifacts when appropriate, for example when the schema changes or after Octo upgrades.

The DISCARD ALL command is safe to run at any time. As running a DISCARD command will cause subsequent commands to run slowly as Octo recreates required artifacts, use it when you need to minimize the size of an Octo environment, for example, to distribute it or archive it.

DISCARD XREFS

DISCARD XREFS;
DISCARD XREFS table_name;

The DISCARD XREFS command removes all metadata (cross references and statistics) where as DISCARD XREFS table_name does the same but for a specific table. Octo automatically recreates the metadata as needed. Note however, that recreating metadata causes the first query that needs it to run more slowly.

DROP TABLE

DROP TABLE [IF EXISTS] table_name [KEEPDATA];

The DROP TABLE statement is used to remove tables from the database. The keywords DROP TABLE are followed by the name of the table desired to be dropped.

If IF EXISTS is supplied for a DROP TABLE statement and a table does not exist, the result is a no-op with no errors. In this case, error type INFO_TABLE_DOES_NOT_EXIST is emitted at INFO log severity level.

Example:

DROP TABLE Employee;

By default, a DROP TABLE statement for a READWRITE table drops the table and also kills all underlying global nodes that stored the table data. The optional parameter KEEPDATA overrides this behavior, preserving the underlying global nodes regardless of table writability type. DROP TABLE statements for READONLY tables always preserve the underlying global nodes whether KEEPDATA is explicitly specified or not.

Error Case

Note

A DROP TABLE waits for all other concurrently running queries(SELECT or CREATE TABLE or DROP TABLE) to finish so it can safely make DDL changes. It waits for an exclusive lock with a timeout of 10 seconds. If it fails due to a timeout, the user needs to stop all concurrently running queries and reattempt the DROP TABLE statement.

DROP FUNCTION

DROP FUNCTION [IF EXISTS] function_name [(arg_type [, ...])];

The DROP FUNCTION statement is used to remove functions from the database. The keywords DROP FUNCTION are followed by the name of the function desired to be dropped and a list of the parameter types expected by the function. These types, if any, must be included as multiple functions may exist with the same name, but must have different parameter type lists.

If IF EXISTS is supplied for a DROP FUNCTION statement and a function does not exist, the result is a no-op with no errors. In this case, error type INFO_FUNCTION_DOES_NOT_EXIST is emitted at INFO log severity level.

Note also that the function name provided should be the name of the user-defined SQL function name, not the M label or routine name.

A function deleted using the DROP FUNCTION statement will also be removed from Octo's internal PostgreSQL catalog. In other words, the function will be removed from the pg_catalog.pg_proc system table.

The following example demonstrates two ways of dropping a function that has no parameters:

DROP FUNCTION userfunc;
DROP FUNCTION userfunc();

This example demonstrates dropping a function with parameters of types VARCHAR and INTEGER:

DROP FUNCTION userfuncwithargs (VARCHAR, INTEGER);

Error Case

Note

A DROP FUNCTION waits for all other concurrently running queries(SELECT or CREATE TABLE or DROP TABLE) to finish so it can safely make DDL changes. It waits for an exclusive lock with a timeout of 10 seconds. If it fails due to a timeout, the user needs to stop all concurrently running queries and reattempt the DROP FUNCTION statement.

DROP VIEW

DROP VIEW [IF EXISTS] view_name;

The DROP VIEW statement is used to remove views from the database. The keywords DROP VIEW are followed by the name of the view desired to be dropped.

If IF EXISTS is supplied for a DROP VIEW statement and a view does not exist, the result is a no-op with no errors. In this case, error type INFO_VIEW_DOES_NOT_EXIST is emitted at INFO log severity level.

Example:

DROP VIEW v1;

Error Case

Note

A DROP VIEW waits for all other concurrently running queries(SELECT or CREATE TABLE or DROP TABLE) to finish so it can safely make DDL changes. It waits for an exclusive lock with a timeout of 10 seconds. If it fails owing to a timeout, retry when currently running queries complete, or after stopping them.

TRUNCATE TABLE

TRUNCATE [TABLE] table_name[, ...];

The TRUNCATE TABLE statement is used to delete all row data from one or more tables in the database, leaving the table definition(s), metadata, and execution plans intact.

The keyword TRUNCATE, optionally followed by the keyword TABLE, is followed by the name of one or more tables whose rows are desired to be deleted.

Example:

TRUNCATE TABLE Employee, Customer;

This command will delete all of the rows in the Employee and Customer tables, but retain the definitions of both tables along with supporting metadata and execution plans.

SELECT

The SELECT statement is used to select rows from the database by specifying a query, and optionally sorting the resulting rows.

Table and column names may be specified as either unquoted identifiers, e.g. id or mytable, or as double-quoted identifiers, e.g. "id" or "mytable". Unquoted identifiers are case insensitive and cast internally to lowercase, while double-quoted identifiers are case sensitive. Additionally, double-quoted identifiers may contain spaces and/or SQL keywords.

Note also that Octo converts all unquoted identifiers to lower case internally, such that double-quoted identifiers referring to tables or columns created by a CREATE TABLE statement that did not specify these names using double-quoted identifiers must be in lower case in order to avoid unknown table or unknown column errors.

SELECT [ALL | DISTINCT]
[ * | expression [[AS] alias_name] [, ...]]
[FROM from_item [, ...]]
[WHERE search_condition]
[GROUP BY grouping_column [, ...]]
[HAVING search_condition]
[{UNION | INTERSECT | EXCEPT} select]
[ORDER BY sort_specification]
[LIMIT number];

ALL

The use of this clause returns all rows, which is the default behavior.

DISTINCT

The use of this clause returns only non-duplicate rows (keeping one each from the set of duplicates).

SELECT *

SELECT * is used as a shorthand for all the columns of the selected rows to be part of the output list. SELECT table_name.* is used as a shorthand for the columns coming from just the table table_name. All the columns in the table table_name are considered for processing in the order they appear.

FROM

This clause specifies the table(s) from which the columns are selected.

from_item can be any of the following:

  • table_name : The name of an existing table.

    /* Selects all rows from the table names */
    SELECT *
    FROM names;
    
  • view_name : The name of an existing view.

    /* Selects all rows generated by the view definition */
    SELECT *
    FROM v1;
    
  • alias : A temporary name given to a table or a view or a column for the purposes of a query. Please refer to the Alias section below for more information.

    /* Selects all rows from the table names aliased as n */
    SELECT *
    FROM names AS n;
    
  • select : A SELECT subquery, which must be surrounded by parentheses. Examples showcasing the usage of the SELECT subquery can be found in the Table Alias section below.

  • join_type : Any one of the JOINS. A join_type cannot be the first from_item. Examples showcasing the usage of join_type can be found in the JOINS section below.

JOINS

Joins can be made by appending a join type and table name to a SELECT statement:

[CROSS | [NATURAL | INNER | [LEFT][RIGHT][FULL] OUTER]] JOIN ON joined_table;

A CROSS JOIN between two tables provides the number of rows in the first table multiplied by the number of rows in the second table.

A NATURAL JOIN is a join operation that combines tables based on columns with the same name and type. The resultant table does not contain repeated columns.

Types of Joins:

For two tables, Table A and Table B,

  • Inner Join : Only the common rows between Table A and Table B are returned.

  • Outer Join

    • Left Outer Join : All rows from Table A are returned, along with matching rows from Table B.

    • Right Outer Join : Matching rows from Table A are returned, along with all rows from Table B.

    • Full Outer Join : All matching rows from Table A and Table B are returned, followed by rows from Table A that have no match and rows from Table B that have no match.

Example:

/* Selects the first name, last name and address of an employee that have an address. The employee and address table are joined on the employee ID values. */
SELECT FirstName, LastName, Address
FROM Employee
INNER JOIN Addresses ON Employee.ID = Addresses.EID;

Note

Currently only the INNER and OUTER JOINs support the ON clause.

WHERE

This clause represents a condition under which columns are selected. If the search_condition evaluates to true, that row is part of the output otherwise it is excluded.

GROUP BY

The GROUP BY clause provides for result rows to be grouped together based on the specified grouping_column. grouping_column can be table_name.* or SELECT list column number or an expression. In case of table_name.* all columns of the table are considered for processing.

Integers in GROUP BY can be used to refer to SELECT list columns. The starting column in the SELECT list corresponds to 1.

If a column name in GROUP BY matches both SELECT list column name and input column name (FROM list), the latter is considered for grouping.

Sub-queries are at present not allowed in GROUP BY.

Example:

/* Selects the Employee ID, first name and last name from the employee table for employees with ID greater than 100. The results are grouped by the last name of the employees. */
SELECT ID, FirstName, LastName
FROM Employee
WHERE ID > 100
GROUP BY LastName;

HAVING

The HAVING clause works to filter the rows that result from the GROUP BY clause. The rows are filtered based on the boolean value returned by the search_condition.

See Technical Notes for details on value expressions.

ORDER BY

ORDER BY lets you sort the order of the rows returned after the query.

To sort rows or columns in the database, you need to have one of the following sort_specifications.

sort_key [COLLATE collation_name] [ASC | DESC];

The sort_key can be a column reference, expression, literal or the shorthand table_name.*.

The sort key can be followed by a collate clause, ordering specification or both.

Note

A collation is a set of rules to compare characters in a character set.

The collate clause consists of the word COLLATE and the relevant collation name.

The ordering specification lets you further choose whether to order the returned columns in ascending (ASC) or descending (DESC) order.

Example:

/* Selects the Employee ID, first name and last name from the employee table for employees with ID greater than 100. The results are ordered in descending order of ID. */
SELECT ID, FirstName, LastName
FROM Employee
WHERE ID > 100
ORDER BY ID DESC;

Integers in ORDER BY can be used to refer to SELECT list columns. The starting column in the SELECT list corresponds to 1.

If a column name in ORDER BY matches both SELECT list column name and input column name (FROM list), the former is considered for ordering.

Error Case

If a column name in ORDER BY matches a user specified alias in SELECT list and a column reference in SELECT list, an ambiguity error is issued.

SELECT 'Zero' != 'Zero' AS firstname,firstname FROM names ORDER BY firstname;
[ERROR]: ERR_AMBIGUOUS_COLUMN_NAME: Ambiguous column name 'FIRSTNAME': qualify name for safe execution

LIMIT

This clause allows the user to specify the number of rows they want to retrieve from the results of the query.

Example:

/* Selects the first five rows from the employee table */
SELECT *
FROM Employee
LIMIT 5;

The above example returns no more than five rows.

Queries without rows

SELECT can also be used to calculate values, without needing to select from a table.

Example:

SELECT (1 * 2) + 3;

INSERT

INSERT INTO table_name ( column name [, column name ...]) [ VALUES ... | (SELECT ...)];

The INSERT statement allows you to insert values into a table. These can either be provided values or values specified as a result of a SELECT statement. INSERT enforces PRIMARY KEY and CHECK constraints.

Example:

INSERT INTO Employee (ID , FirstName, LastName) VALUES (220, 'Jon', 'Doe'), (383, 'Another', 'Name');

UPDATE

UPDATE table_name [[AS] alias_name] SET column1 = expression [, column2 = expression ...] [WHERE search_condition];

The UPDATE statement allows you to change existing records in the table. table_name specifies the name of the table to be updated followed by a list of comma-separated statements that are used to update existing columns in the table with specified values. Only those columns in table_name that require change need to be mentioned in the SET clause. The remaining columns retain their previous values. The optional WHERE condition allows you to update columns only on those rows of the table that satisfy the specified search_condition. UPDATE enforces PRIMARY KEY and CHECK constraints.

Example:

UPDATE Employee SET FirstName = 'John' WHERE ID = 220;

DELETE

DELETE FROM table_name [[AS] alias_name] [WHERE search_condition];

The DELETE statement consists of the keywords DELETE FROM followed by the name of the table and possibly a search condition.

The search condition eventually yields a boolean true or false value, and may contain further search modifications detailing where to apply the search_condition and how to compare the resulting values.

Example:

DELETE FROM Employee WHERE ID = 220;

SET

(Partially supported.)

SET runtime_parameter = value;

The SET command changes the value of a run-time configuration parameter. Presently, Octo does not honor such parameter settings itself, but merely provides the SET interface for compatibility with PostgreSQL clients. Note that run-time parameter names are case-insensitive when using SET.

Example:

SET DateStyle = 'ISO';

Runtime parameter information is maintained in the pg_catalog.pg_settings PostgreSQL catalog table. Using a SET command to change the value of a run-time parameter will also update the entry for that parameter in pg_catalog.pg_settings.

Note that SET commands treat SQL NULL values as empty strings. For example, the following command sets the DateStyle parameter to the empty string:

SET DateStyle = NULL;

Note that updates to pg_catalog.pg_settings using the INSERT INTO, DELETE FROM or UPDATE commands are disallowed (would issue a ERR_TABLE_READONLY error).

SHOW

SHOW runtime_parameter;

The SHOW command prints the value of a run-time configuration parameter. Note that run-time parameter names are case-insensitive when using SHOW.

Example:

SHOW DateStyle;

Runtime parameter information is maintained in the pg_catalog.pg_settings PostgreSQL catalog table. Accordingly, run-time parameter information may be viewed by querying this table. When using this method, the parameter name is case-sensitive, as the name will be looked up by comparing the given literal value against a canonical name in the database.

Example:

SELECT name, setting FROM pg_catalog.pg_settings WHERE name = 'DateStyle';

To list of all run-time parameter information:

SELECT * FROM pg_catalog.pg_settings;

Set Operations

These are operations that work on the results of two or more queries.

The conditions are:

  • The data types in the results of each query need to be compatible.

  • The order and number of the columns in each result set need to be the same.

UNION

SELECT [.....]
FROM table_name[...]
UNION
[ALL] SELECT [.....]
FROM table_name2[...]....;

The UNION operation consists of two or more queries joined together with the word UNION. It combines the results of two individual queries into a single set of results.

The keyword ALL ensures that duplicate rows of results are not removed during the UNION.

Example:

SELECT FirstName
FROM Employee
UNION
SELECT FirstName
FROM AddressBook;

INTERSECT

SELECT [.....]
FROM table_name[......]
INTERSECT
[ALL] SELECT [.....]
FROM table_name2[....]......;

The INTERSECT operation consists of two or more queries joined together with the word INTERSECT. It returns distinct non-duplicate results that are returned by both queries on either side of the operation.

The keyword ALL ensures that duplicate rows of results returned by both queries are not eliminated during the INTERSECT.

SELECT ID
FROM Employee
INTERSECT
SELECT ID
FROM AddressBook;

EXCEPT

SELECT [.....]
FROM table_name[.....]
EXCEPT
[ALL] SELECT [.....]
FROM table_name2[......].......;

The EXCEPT operation consists of two or more queries joined together with the word EXCEPT. It returns (non-duplicate) results from the query on the left side except those that are also part of the results from the query on the right side.

The keyword ALL affects the resulting rows such that duplicate results are allowed but rows in the first table are eliminated if there is a corresponding row in the second table.

SELECT LastName
FROM Employee
EXCEPT
SELECT LastName
FROM AddressBook;

VALUES

VALUES provides a way to generate an "on-the-fly" table that can be used in a query without having to actually create and populate a table on-disk.

The syntax is:

VALUES ( expression [, ...] ) [, ...]

Each parenthesized list of expressions generates one row in the table. Each specified row must have the same number of comma-separated entries (could be constants, expressions, subqueries etc.). This becomes the number of columns in the generated table. Corresponding entries in each row must have compatible data types. The data type assigned to each column of the generated table is determined based on the data type of the entries in the row lists.

The columns of the generated table are assigned the names column1, column2, etc.

For example, the below generates a table of two columns and three rows.

VALUES (1, 'one'), (2, 'two'), (3, 'three');

will return a table containing two columns (named column1 with type INTEGER and column2 with type VARCHAR) and three rows.

VALUES followed by expression lists can appear anywhere a SELECT can. So, the two queries below are equivalent.

VALUES (1, 'one'), (2, 'two'), (3, 'three');
SELECT 1, 'one' UNION SELECT 2, 'two' UNION SELECT 3, 'three';

There is an exception to this currently, ORDER BY and LIMIT cannot be specified at the end of VALUES like they can be for SELECT.

Below are examples of using VALUES with entries containing expressions and subqueries:

SELECT 5 + (VALUES (3));
SELECT * FROM (VALUES ((SELECT 1), 2));
VALUES((SELECT id FROM names WHERE id > 5));

CASE

Octo supports two different formats of the CASE statement.

CASE value_expression
WHEN value_1 THEN result_1
WHEN value_2 THEN result_2
[WHEN ... ]
[ELSE result_n]
END

This form of the CASE statement evaluates the value_expression and sequentially compares that to each of the values following WHEN. Upon finding a match it returns the corresponding "result" following THEN. If no match is found then the "result" following ELSE is returned, or NULL is returned if ELSE has been omitted.

CASE WHEN condition_expression_1 THEN result_1
     WHEN condition_expression_2 THEN result_2
     [WHEN ... ]
     [ELSE result_n]
END

The second form of the CASE statement sequentially tests each condition_expression. If a condition_expression evaluates to TRUE, the "result" following THEN is returned. If all conditions evaluate to FALSE the "result" following ELSE is returned, or NULL is returned if ELSE has been omitted.

Functions

Octo supports the following built-in functions. Each of these functions comes pre-defined with Octo, and can be used straightaway without the need for the user to define them.

Note that function prototypes that appear both with and without parentheses indicate that the given function may be called both with and without parentheses. For example, CURRENT_CATALOG() may be called as either CURRENT_CATALOG() or CURRENT_CATALOG.

Function names may be specified as either unquoted identifiers, e.g. abs or concat, or as double-quoted identifiers, e.g. "abs" or "concat". Unquoted identifiers are case insensitive and will be internally cast to lowercase, while double-quoted identifiers are case sensitive. Additionally, double-quoted identifiers may contain spaces and/or SQL keywords.

Note that when calling functions using double-quoted identifiers, only the function name should be double quoted and not the parentheses or arguments.

ABS

ABS(NUMERIC)

ABS returns the absolute value of a number.

COALESCE

COALESCE(value_expression [, value_expression...])

The built-in COALESCE function returns the first of its arguments that is not NULL. If all arguments are NULL, NULL is returned. COALESCE must have at least one argument.

The arguments passed to COALESCE all have to be of the same type. For example, the following query is valid and returns the value 'a':

SELECT COALESCE(NULL, 'a', 'b');

CONCAT

CONCAT(VARCHAR, VARCHAR)
CONCAT(VARCHAR, VARCHAR, VARCHAR)

The built-in CONCAT function returns the concatenation of its arguments as a VARCHAR value. This function may be used with two or three VARCHAR arguments to be concatenated.

SELECT CONCAT('string1', 'string2')
SELECT CONCAT('string1', 'string2', 'string3')

Date and Time type arguments are also accepted by this function as long as one of the arguments is a VARCHAR. The result will always have date/time value in text format.

OCTO> SELECT CONCAT(DATE'2023-01-01', 'sample text');
concat
2023-01-01sample text
(1 row)

OCTO> SELECT CONCAT(DATE(FILEMAN)'3230101', 'sample text');
concat
2023-01-01sample text
(1 row)

CURRENT_CATALOG

CURRENT_CATALOG
CURRENT_CATALOG()

The built-in CURRENT_CATALOG function returns the name of the current database catalog. However, since Octo currently does not support the use of more than one database catalog, this function always returns "octo".

CURRENT_DATABASE

CURRENT_DATABASE()

The built-in CURRENT_DATABASE function returns the name of the current database. However, since Octo currently does not support the use of more than one database, this function always returns "octo".

CURRENT_ROLE

CURRENT_ROLE
CURRENT_ROLE()

The built-in CURRENT_ROLE function returns the name of the current user role. However, since Octo currently does not support user roles, this function is an alias for CURRENT_USER().

CURRENT_SCHEMA

CURRENT_SCHEMA
CURRENT_SCHEMA()

The built-in CURRENT_SCHEMA function returns the name of the current database schema. However, since Octo currently does not multiple schemas, this function will always return "public".

CURRENT_TIME

CURRENT_TIME
CURRENT_TIME()

The built-in CURRENT_TIME returns the current system time in the following formats, depending on which database emulation setting is active:

  • POSTGRES emulation: hh:mm:ss.UUUUUU[-|+]LL, where U is a microsecond field and [-|+]LL is the positive or negative UTC offset.

  • MYSQL emulation: hh:mm:ss

CURRENT_TIMESTAMP

CURRENT_TIMESTAMP
CURRENT_TIMESTAMP()

The built-in CURRENT_TIMESTAMP is a synonym for the NOW function, and returns the current system time in the following formats, depending on which database emulation setting is active:

  • POSTGRES emulation: YYYY-MM-DD hh:mm:ss.uuuuuu[-|+]LL, where u is a microsecond field and [-|+]LL is the positive or negative UTC offset.

  • MYSQL emulation: YYYY-MM-DD hh:mm:ss

CURRENT_USER

CURRENT_USER
CURRENT_USER()

The built-in CURRENT_USER function returns the username of the current Rocto user. Returns an empty string in Octo, since Octo does not implement SQL user authentication and does not distinguish between users.

Note that CURRENT_USER() is a synonym for USER().

DAY

DAY(VARCHAR)
DAY(DATE)

The built-in DAY function is a synonym for DAYOFMONTH, and accepts a date in the format YYYY-MM-DD as a string (DAY('2023-01-01')) and returns the numeric day of the month in the range 0-31 where 0 is returned for dates that have a value of zero for the day field, e.g. 0000-00-00. Also, DATE type values are also accepted with datestyle set to YMD (DAY(date'2023-01-01'),:code:DAY(date(fileman)'3230101')). When DATE type value is passed the function returns the numeric day of the month in the range 1-31.

DAYOFMONTH

DAYOFMONTH(VARCHAR)
DAYOFMONTH(DATE)

The built-in DAYOFMONTH function accepts a date in the format YYYY-MM-DD as a string (DAYOFMONTH('2023-01-01')) and returns the numeric day of the month in the range 0-31 where 0 is returned for dates that have a value of zero for the day field, e.g. 1999-06-00. Also, DATE type values are also accepted with datestyle set to YMD (DAYOFMONTH(date'2023-01-01'),:code:DAYOFMONTH(date(fileman)'3230101')). When DATE type value is passed the function returns the numeric day of the month in the range 1-31.

DATE_FORMAT

DATE_FORMAT(VARCHAR,VARCHAR)
DATE_FORMAT(TIMESTAMP,VARCHAR)

The built-in DATE_FORMAT function accepts a date in the format YYYY-MM-DD hh:mm:ss.uuuuuu and a format string, and returns a new string wherein the given date is formatted according to the format specified. Note that the number of microseconds uuuuuu may be omitted such that the date may be in the format YYYY-MM-DD hh:mm:ss.

Note that in the following table there is reference to MySQL WEEK() modes. Presently, Octo does not implement WEEK(), but the MySQL WEEK() modes are implemented for those format codes below that require them. For more information on WEEK() modes, see the MySQL documentation.

Acceptable formatting symbols for DATE_FORMAT format string are as follows:

Format symbol

Description

%a

Abbreviated weekday name (Sun..Sat)

%b

Abbreviated month name (Jan..Dec)

%c

Month, numeric (0..12)

%D

Day of the month with English suffix (0th, 1st, 2nd, 3rd, …)

%d

Day of the month, numeric (00..31)

%e

Day of the month, numeric (0..31)

%f

Microseconds (000000..999999)

%H

Hour (00..23)

%h

Hour (01..12)

%I

Hour (01..12)

%i

Minutes, numeric (00..59)

%j

Day of year (001..366)

%k

Hour (0..23)

%l

Hour (1..12)

%M

Month name (January..December)

%m

Month, numeric (00..12)

%p

AM or PM

%r

Time, 12-hour (hh:mm:ss followed by AM or PM)

%S

Seconds (00..59)

%s

Seconds (00..59)

%T

Time, 24-hour (hh:mm:ss)

%U

Week (00..53), where Sunday is the first day of the week; Corresponding to MySQL WEEK() mode 0

%u

Week (00..53), where Monday is the first day of the week; Corresponding to MySQL WEEK() mode 1

%V

Week (01..53), where Sunday is the first day of the week; Corresponding to MySQL WEEK() mode 2; used with %X

%v

Week (01..53), where Monday is the first day of the week; Corresponding to MySQL WEEK() mode 3; used with %x

%W

Weekday name (Sunday..Saturday)

%w

Day of the week (0=Sunday..6=Saturday)

%X

Year for the week where Sunday is the first day of the week, numeric, four digits; used with %V

%x

Year for the week, where Monday is the first day of the week, numeric, four digits; used with %v

%Y

Year, numeric, four digits

%y

Year, numeric (two digits)

%%

A literal % character

%x

x, for any "x" not listed above

OCTO> SELECT DATE_FORMAT('2004-10-22 21:20:14', '%W %M %Y');
DATE_FORMAT
Friday October 2004
(1 row)
OCTO> SELECT DATE_FORMAT('2019-10-22 21:20:14', '%H:%i:%s');
DATE_FORMAT
21:20:14
(1 row)
OCTO> SELECT DATE_FORMAT('1920-10-22 21:20:14', '%D %y %a %d %m %b %j');
DATE_FORMAT
22nd 20 Fri 22 10 Oct 296
(1 row)
OCTO> SELECT DATE_FORMAT('1994-10-22 21:20:14', '%H %k %I %r %T %S %w');
DATE_FORMAT
21 21 09 09:20:14 PM 21:20:14 14 6
(1 row)
OCTO> SELECT DATE_FORMAT('1999-01-01', '%X %V');
DATE_FORMAT
1998 52
(1 row)
OCTO> SELECT DATE_FORMAT('2006-06-00', '%d');
DATE_FORMAT
00
(1 row)

GREATEST and LEAST

GREATEST(value_expression [, value_expression...])
LEAST(value_expression [, value_expression...])

The built-in GREATEST function returns the largest value from a list of expressions. Similarly, LEAST returns the smallest value. NULL values are ignored, unless all values are NULL, in which case the return value is NULL. All arguments must have the same type.

HAS_DATABASE_PRIVILEGE

HAS_DATABASE_PRIVILEGE(username, databasename, privilege)

The built-in HAS_DATABASE_PRIVILEGE function returns true if the user (first argument) of the specified database (second argument) has the specified privilege (third argument). However, since Octo currently does not implement privileges, this function will always return true (1).

LOCALTIME

LOCALTIME
LOCALTIME()

The built-in LOCALTIME function returns the current system time in the following formats, depending on which database emulation setting is active:

  • POSTGRES emulation: hh:mm:ss.UUUUUU[-|+]LL, where U is a microsecond field and [-|+]LL is the positive or negative UTC offset.

  • MYSQL emulation (synonym for NOW): YYYY-MM-DD hh:mm:ss

LOCALTIMESTAMP

LOCALTIMESTAMP
LOCALTIMESTAMP()

The built-in LOCALTIMESTAMP is a synonym for the NOW function, and returns the current system time in the following formats, depending on which database emulation setting is active:

  • POSTGRES emulation: YYYY-MM-DD hh:mm:ss.UUUUUU[-|+]LL, where U is a microsecond field and [-|+]LL is the positive or negative UTC offset.

  • MYSQL emulation: YYYY-MM-DD hh:mm:ss

LPAD

LPAD(VARCHAR, INTEGER)
LPAD(VARCHAR, INTEGER, VARCHAR)

The built-in LPAD function adds padding to the left hand side of a string (first argument) up to the designated length (second argument). The default padding is a space, which is used in the two-argument form of this function. However, an optional third argument specifying a specific string to use for padding may also be used.

Note that in POSTGRES emulation either the two- or three- argument form may be used. However, MySQL only supports the three-argument version, so a third argument must always be specified when using the MYSQL emulation setting.

NOW

NOW()

The built-in NOW function returns the current system time in the following formats, depending on which database emulation setting is active:

  • POSTGRES emulation: YYYY-MM-DD hh:mm:ss.UUUUUU[-|+]LL, where U is a microsecond field and [-|+]LL is the positive or negative UTC offset.

  • MYSQL emulation: YYYY-MM-DD hh:mm:ss

Note that NOW is a synonym for CURRENT_TIMESTAMP, but, unlike the latter function, it must always include parentheses.

NULLIF

NULLIF(value_expression, value_expression)

The built-in NULLIF function returns NULL if both arguments are equal, or the first argument otherwise. The arguments must have the same type.

PG_ENCODING_TO_CHAR

PG_ENCODING_TO_CHAR(INTEGER)

The built-in PG_ENCODING_TO_CHAR function converts the value of the current character encoding setting from INTEGER representation to VARCHAR.

Since PostgreSQL encodings are not fully supported by Octo, this function will always return SQL_ASCII.

Note

This function is only partially implemented and returns a fixed value regardless of input. It is partially implemented solely to avoid syntax errors during SQL client startup.

PG_IS_IN_RECOVERY

PG_IS_IN_RECOVERY()

The built-in PG_IS_IN_RECOVERY function returns true if the database is in the process of recovering from a failure by restoring a backup. Since Octo doesn't currently support this feature, this function always returns false (0).

Note

This function is only partially implemented and returns a fixed value regardless of input. It is partially implemented solely to avoid syntax errors during SQL client startup.

PG_IS_XLOG_REPLAY_PAUSED

PG_IS_XLOG_REPLAY_PAUSED()

The built-in PG_IS_XLOG_REPLAY_PAUSED function returns true if the database has paused the process of recovering from a failure by restoring a backup. Since Octo doesn't currently support this feature, this function always returns false (0).

Note

This function is only partially implemented and returns a fixed value regardless of input. It is partially implemented solely to avoid syntax errors during SQL client startup.

REPLACE

REPLACE(VARCHAR, VARCHAR, VARCHAR)

This is a standard SQL replace function which does substring replacement in the given input. The 1st argument is the source string in which the 2nd argument, which is the substring is searched and replaced with the string given in the 3rd argument.

Note

  • When an empty string is passed as 1st argument, the result will be NULL irrespective of other arguments

  • When an empty string is passed as 2nd or 3rd argument it is treated as an empty string

  • When NULL is passed as 1st argument, the result will be NULL irrespective of other arguments

  • When NULL is passed as 2nd or 3rd argument it is treated as an empty string

ROUND

ROUND(NUMERIC, INTEGER)

ROUND returns the first argument rounded to the precision specified by the second argument. If the precision is greater than zero, the number will be rounded to that number of decimal places. If the precision is zero, it will be rounded to the nearest integer. If the precision is less than zero, all fractional digits will be truncated and the number will be rounded to 10^precision. The precision must be no less than -46.

SESSION_USER

SESSION_USER
SESSION_USER()

The built-in SESSION_USER function returns the name of the current session user. However, since Octo currently does not support session users, this function is an alias for CURRENT_USER.

TRUNC/TRUNCATE

TRUNC(NUMERIC, INTEGER)
TRUNC(NUMERIC, NUMERIC)
TRUNC(INTEGER, NUMERIC)
TRUNC(INTEGER, INTEGER)
TRUNCATE(NUMERIC, INTEGER)
TRUNCATE(NUMERIC, NUMERIC)
TRUNCATE(INTEGER, NUMERIC)
TRUNCATE(INTEGER, INTEGER)

TRUNC (or TRUNCATE) returns the first argument truncated to the precision specified by the second argument. If the precision is greater than zero, the number will be truncated to that number of decimal places. If the precision is zero, this behaves the same as the mathematical floor function. If the precision is less than zero, all fractional digits will be truncated and the number will be truncated to 10^precision. The precision must be no less than -43.

USER

USER
USER()

The built-in USER function returns the username of the current Rocto user. Returns an empty string in Octo, since Octo does not implement SQL user authentication and does not distinguish between users.

Note that USER() is a synonym for CURRENT_USER().

Aggregate Functions

Aggregate functions operate on a set of values in "aggregate" to yield a single new value.

The set of values passed to each aggregate function call may be preceded by the ALL or DISTINCT keyword. ALL will cause the aggregregate function to process all passed values. DISTINCT will cause it to process all unique values, omitting duplicates.

AVG

AVG([ALL | DISTINCT] value_expression)

The AVG aggregate function returns the average of all the non-NULL values passed to it.

COUNT

COUNT(*)
COUNT([ALL | DISTINCT] value_expression)

The COUNT aggregate function returns the number of rows belonging to a table.

COUNT(*) returns the number of rows belonging to the table referenced in the FROM clause of the given query.

If a subquery is passed to COUNT, then it returns the number of result rows that would be returned by the subquery.

If a column name is passed to COUNT, then it returns the number of non-NULL result rows for the given column.

MAX

MAX([ALL | DISTINCT] value_expression)

The MAX aggregate function returns the maximum of all the non-NULL values passed to it.

MIN

MIN([ALL | DISTINCT] value_expression)

The MIN aggregate function returns the maximum of all the non-NULL values passed to it.

SUM

SUM(INT)
SUM(NUMERIC)

The SUM aggregate function returns the sum of all the non-NULL values passed to it.

Constructors

ARRAY

ARRAY(single_column_subquery)

The ARRAY constructor can be used to generate a single-dimensional array from the results of a subquery, with each result row value occupying one element of the array. The subquery must return only one column.

Note

The array data type is not currently supported and the constructed array is in fact treated as a string in Octo. As a result, multi-dimensional arrays cannot be constructed using this syntax. Similarly, syntax and functions that rely on the array data type are also unsupported.

Operators

The arithmetic operators in Octo are:

  • ADDITION +

  • SUBTRACTION -

  • MULTIPLICATION *

  • DIVISION /

  • MODULO %

The string operator in Octo is:

  • Concatenation ||

The comparative operators in Octo are:

  • EQUALS =

  • NOT EQUALS <>

  • LESS THAN <

  • GREATER THAN >

  • LESS THAN OR EQUALS <=

  • GREATER THAN OR EQUALS >=

The logical operators in Octo are:

  • AND : The record will be displayed if all the conditions are TRUE

  • OR : The record will be displayed if any of the conditions is TRUE

  • NOT : The record will be displayed if the condition(s) is NOT TRUE

Other operators in Octo:

  • BETWEEN : This operator selects values within a given range, begin and end values included.

  • EXISTS : The result is TRUE if the evaluated subquery returns at least one row. It is FALSE if the evaluated subquery returns no rows.

  • ANY/SOME : The result is TRUE if any true result is obtained when the expression is evaluated and compared to each row of the subquery result. It is FALSE if no true result is found or if the subquery returns no rows.

Alias

Double quotes and non-quoted identifiers can be used to represent alias names. Note, however, that double-quoted identifiers are case sensitive, while unquoted identifiers are not. Additionally, double-quoted identifiers may contain spaces and/or SQL keywords.

Column Alias

A column alias can be used in two different ways:

  1. As part of SELECT

    SELECT column [AS] column_alias
    FROM from_item;
    

    Examples:

    OCTO> select firstname as "quoted" from names limit 1;
    QUOTED
    Zero
    
    OCTO> select firstname as 'quoted' from names limit 1;
    QUOTED
    Zero
    
    OCTO> select firstname as ida from names limit 1;
    IDA
    Zero
    
    OCTO> select ida from (select 8 as "ida") n1;
    IDA
    8
    
    OCTO> select ida from (select 8 as 'ida') n1;
    IDA
    8
    
    OCTO> select ida from (select 8 as ida) n1;
    IDA
    8
    
    OCTO> select ida from (select 8 as ida) as n1;
    IDA
    8
    

    Column aliases are supported in short form i.e without AS keyword

    OCTO> select ida from (select 8 ida) n1;
    IDA
    8
    
  2. As part of FROM

    SELECT [ALL | DISTINCT]
    [* | expression]
    FROM table_name [AS] table_alias(column_alias [, ...]);
    

    Examples:

    OCTO> SELECT * FROM names AS tblalias(colalias1, colalias2, colalias3) WHERE tblalias.colalias1 = 1;
    COLALIAS1|COLALIAS2|COLALIAS3
    1|Acid|Burn
    

Table Alias

Usage:

[table_name | subquery] [AS] aliasname

Examples:

OCTO> select n1.firstname from names as "n1" limit 1;
FIRSTNAME
Zero

OCTO> select n1.firstname from names as 'n1' limit 1;
FIRSTNAME
Zero

OCTO> select n1.firstname from names as n1 limit 1;
FIRSTNAME
Zero

OCTO> select 1 as output from names as n1 inner join (select n2.id from names as n2 LIMIT 3) as alias2 ON (n1.id = alias2.id );
OUTPUT
1
1
1

/* The select subquery uses aliases for the table as well as columns. This query selects one row from the names table aliased as tblalias, where the value of the colalias1 is one(1). */
OCTO> SELECT * FROM (SELECT * FROM names) as tblalias(colalias1, colalias2, colalias3) WHERE tblalias.colalias1 = 1;
COLALIAS1|COLALIAS2|COLALIAS3
1|Acid|Burn

Table aliases are supported in short form i.e without AS

OCTO> select n1.firstname from names "n1" limit 1;
FIRSTNAME
Zero

Note

  • If single quotes or double quotes are used, keywords like NULL, AS etc can be used as alias name

  • Aliasing with quoted multi words, containing spaces, are supported. But their usage as a reference (column or table) is not yet supported

    For example:

    Supported:

    select id as "id a" from names;

    select id from names as "n one";

    select id "id a" from names;

    select id from names "n one";

    Not Supported:

    select "id a" from (select 8 as "id a") n1; -> (column name with spaces)

    select 1 from names as n1 inner join (select n2.id from names as n2 LIMIT 3) as "alias two" ON (n1.id = "alias two".id); -> (table name with spaces)

  • Multi word aliases i.e with spaces can only be formed with single or double quotes

    For example:

    Supported:

    column [AS] "word word"

    column [AS] 'word word'

    [table_name | subquery] [AS] "word word"

    [table_name | subquery] [AS] 'word word'

    Not supported:

    column [AS] word word

    [table_name | subquery] [AS] word word

Pattern Processing

LIKE

string LIKE pattern

If the pattern matches the string, LIKE operation returns true.

Pattern is expected to match the entire string i.e.

'a'  LIKE 'a' -> TRUE
'ab' LIKE 'a' -> FALSE

% and _ have a special meaning. % matches any string of zero or more characters and _ matches any single chracter.

'abcd' LIKE '%'    -> TRUE
'abcd' LIKE 'ab%'  -> TRUE
'cdcd' LIKE 'ab%'  -> FALSE
'abcd' LIKE 'a_cd' -> TRUE
'ebcd' LIKE 'a_cd' -> FALSE

Escaping % or _ will take away its special meaning, and, it will just match % and _ in its literal form.

'ab%ab' LIKE 'ab\%ab' -> TRUE
'abab'  LIKE 'ab\%ab' -> FALSE
'ab_ab' LIKE 'ab\_ab' -> TRUE
'abab'  LIKE 'ab\_ab' -> FALSE

To match an escape as itself additional escape is required. Any other character if escaped has no special meaning. It will match its literal self.

'ab\ab' LIKE 'ab\\ab' -> TRUE
'ab\ab' LIKE 'ab\ab'  -> FALSE
'abab'  LIKE 'ab\ab'  -> TRUE

Any other character is matched without any special meaning.

'ab*&$#' LIKE 'ab*&$#' -> TRUE
'ab*&$#' LIKE 'ab*'    -> FALSE

Variations of LIKE

  1. ~~ : Same as LIKE

  2. ILIKE : Case insensitive version of LIKE

    'abc' ILIKE 'Abc' -> TRUE
    'abc' LIKE  'Abc' -> FALSE
    
  3. ~~* : Case insensitive version of LIKE

  4. NOT LIKE : Negated version of LIKE

    'abc' LIKE 'abc'      -> TRUE
    'abc' LIKE 'cba'      -> FALSE
    'abc' LIKE '%'        -> TRUE
    'abc' NOT LIKE 'abc'  -> FALSE
    'abc' NOT LIKE 'cba'  -> TRUE
    'abc' NOT LIKE '%'    -> FALSE
    
  5. !~~ : Negated version of LIKE

  6. NOT ILIKE : Negated version of case insensitive LIKE

  7. !~~* : Negated version of case insensitive LIKE

Error Case

LIKE pattern cannot end with an escape character. This results in an error.

'abc' LIKE 'abc\'
[ERROR]: ERR_INVALID_ESCAPE_PATTERN: Cannot end pattern with escape character: abc\

'abc\' LIKE 'abc\\' -> TRUE

SIMILAR TO

string SIMILAR TO pattern

If the pattern matches the string, SIMILAR TO operation returns true.

Pattern is expected to match the entire string i.e.

'a'  SIMILAR TO 'a' -> TRUE
'ab' SIMILAR TO 'a' -> FALSE

As seen in the LIKE operation, following characters have special meaning:

  • % matches any string of zero or more characters

  • _ matches any single character

  • Escaping % or _ will take away its special meaning, and, it will just match % or _ in its literal form

  • To match an escape as itself additional escape is required

Additionally, the following characters also having special meaning:

  • | : The whole string should match a unit on either side of |

    'abd' SIMILAR TO 'abc|d'       -> TRUE ( Here along with other characters, the right side of | which is 'd' is matched )
    'dba' SIMILAR TO '(abc)|(dba)' -> TRUE ( Here the right side of | which is (dba) is matched )
    
  • * : Match a sequence of zero or more units

    'wow'         SIMILAR TO 'woo*w'    -> TRUE
    'wooow'       SIMILAR TO 'woo*w'    -> TRUE
    'dabcabcabcd' SIMILAR TO 'd(abc)*d' -> TRUE
    'dd'          SIMILAR TO 'd(abc)*d' -> TRUE
    
  • + : Match a sequence of one or more units

    'dabcabcd' SIMILAR TO 'd(abc)+d'  -> TRUE
    'dd'       SIMILAR TO 'd(abc)+d'  -> FALSE
    
  • ( ) : Groups contained items into a single logical unit

  • [ ] : Matches any one of the characters mentioned inside the brackets

    'a' SIMILAR TO '[abc]' -> TRUE
    'c' SIMILAR TO '[abc]' -> TRUE
    'd' SIMILAR TO '[abc]' -> FALSE
    
  • { }

    • {m} : Match a sequence of exactly m units

      'aaaa' SIMILAR TO 'a{4}' -> TRUE
      'aaa'  SIMILAR TO 'a{4}' -> FALSE
      
    • {m,} : Match a sequence of m or more units

      'aaaaa'  SIMILAR TO 'a{2,}'      -> TRUE
      'a'      SIMILAR TO 'a{2,}'      -> FALSE
      'ababab' SIMILAR TO '(ab){2,}'   -> TRUE
      'ab'     SIMILAR TO '(ab){2,}'   -> FALSE
      
    • {m,n} : Match a sequence of exactly m through n (inclusive) units

      'aaa' SIMILAR TO 'a{1,3}'   -> TRUE
      'aa'  SIMILAR TO 'a{1,3}'   -> FALSE
      
  • ? : Match zero or one unit

    'abc'  SIMILAR TO 'ab?c'    -> TRUE
    'ac'   SIMILAR TO 'ab?c'    -> TRUE
    'abbc' SIMILAR TO 'ab?c'    -> FALSE
    'azyc' SIMILAR TO 'a(zy)?c' -> TRUE
    'ac'   SIMILAR TO 'a(zy)?c' -> TRUE
    'azc'  SIMILAR TO 'a(zy)?c' -> FALSE
    

Note

  • A unit refers to a logical grouping done using ( ) or a character depending on its usage

    For example:

    'ababab' SIMILAR TO '(ab)+' -> TRUE ( Here ab is the logical unit considered by + )

    'abbb' SIMILAR TO 'ab+' -> TRUE ( Here b is the logical unit considered by + )

  • Similar to the LIKE operation, if the above characters are escaped they lose their special meaning

Variation of SIMILAR TO

  1. NOT SIMILAR TO : Negated version of SIMILAR TO

    'abc' SIMILAR TO     'abc'   -> TRUE
    'abc' NOT SIMILAR TO 'abc'   -> FALSE
    

TILDE ~

string ~ pattern

If the pattern matches the string, ~ operation returns true.

Partial match of the pattern is valid, i.e.

'a'  ~ 'a'          -> TRUE
'ab' ~ 'a'          -> TRUE  (Partial match is valid)
'ab' SIMILAR TO 'a' -> FALSE (Partial match is not valid)
'ab' LIKE 'a'       -> FALSE (Partial match is not valid)

% and _ have no special meaning. They are matched as literals.

To match an escape as itself additional escape is required.

The following characters have special meaning:

  • . : Matches any single character

    'abc' ~ '...' -> TRUE
    
  • * : Match a sequence of zero or more units

    'aab' ~ 'a*'  -> TRUE
    'baa' ~ 'a*'  -> TRUE
    
  • | : Match a unit on either side of |

    'abd' LIKE       'abc|d'       -> FALSE ( | does not have special meaning for LIKE operation )
    'abd' SIMILAR TO 'abc|d'       -> FALSE ( | expects 'abd' to match either 'abc' or 'd' . But, as 'abd' is not either of those, the result is FALSE )
    'abd' ~          'abc|d'       -> TRUE  ( | expects 'abd' to match either 'abc' or 'abd'. Hence the result is TRUE )
    
  • + : Match a sequence of one or more units

    'dabcabcd' ~ '(abc)+'  -> TRUE
    'dd'       ~ '(xyz)+'  -> FALSE
    'dd'       ~ 'd+'      -> TRUE
    'a'        ~ 'd+'      -> FALSE
    
  • ( ) : Groups contained items into a single logical unit

  • [ ] : Matches any one of the characters mentioned inside the brackets

    'a'   ~ '[abc]' -> TRUE
    'zay' ~ '[abc]' -> TRUE
    'zy'  ~ '[abc]' -> FALSE
    
  • { }

    • {m} : Match a sequence of exactly m units

      'yyaaaabcc' ~ 'a{4}' -> TRUE
      'yyaaabcc'  ~ 'a{4}' -> FALSE
      
    • {m,} : Match a sequence of m or more units

      'yyaaabcc'     ~ 'a{2,}'      -> TRUE
      'yyabcc'       ~ 'a{2,}'      -> FALSE
      'yyabaaababcc' ~ '(ab){2,}'   -> TRUE
      'yyabcc'       ~ '(ab){2,}'   -> FALSE
      
    • {m,n} : Match a sequence of exactly m through n (inclusive) units

      'aaa' ~ 'a{1,3}'   -> TRUE
      'aa'  ~ 'a{1,3}'   -> FALSE
      
  • ? : Match zero or one unit

'abcd'  ~ 'ab?c'    -> TRUE
'acd'   ~ 'ab?c'    -> TRUE
'abbcd' ~ 'ab?c'    -> FALSE
'azycd' ~ 'a(zy)?c' -> TRUE
'acd'   ~ 'a(zy)?c' -> TRUE
'azcd'  ~ 'a(zy)?c' -> FALSE

Note

  • A unit refers to a logical grouping done using ( ) or a character depending on its usage

  • If the above characters are escaped they lose their special meaning

Variations of ~

  1. !~ : Negated version of ~

  2. ~* : Case insensitive version of ~

  3. !~* : Negated version of case insensitive ~

Useful Commands at OCTO>

Command

Information

\q

Exits the prompt

\d

Displays all relations

\d tablename

Displays information about columns of specified table

\d viewname

Displays information about the specified view

\dv

Displays all views

\s

Displays Octo command history

Note

\d , \dv , \d tablename and \d viewname require a semi-colon to terminate the query. Newlines will not terminate \d queries.

Relation shown will be similar to the following:

OCTO> \d;
SCHEMA|NAME|TYPE|OWNER
public|INFORMATION_SCHEMA.TABLES|table|octo
public|NAMES|table|octo
public|NAMESWITHAGES|table|octo
public|OCTOONEROWTABLE|table|octo
public|PG_ATTRDEF|table|octo
public|PG_ATTRIBUTE|table|octo
public|PG_CATALOG.PG_ATTRDEF|table|octo
public|PG_CATALOG.PG_ATTRIBUTE|table|octo
public|PG_CATALOG.PG_CLASS|table|octo
public|PG_CATALOG.PG_DATABASE|table|octo
public|PG_CATALOG.PG_DESCRIPTION|table|octo
public|PG_CATALOG.PG_NAMESPACE|table|octo
public|PG_CATALOG.PG_PROC|table|octo
public|PG_CATALOG.PG_ROLES|table|octo
public|PG_CATALOG.PG_SETTINGS|table|octo
public|PG_CATALOG.PG_TYPE|table|octo
public|PG_CATALOG.PG_USER|table|octo
public|PG_CLASS|table|octo
public|PG_DATABASE|table|octo
public|PG_DESCRIPTION|table|octo
public|PG_NAMESPACE|table|octo
public|PG_PROC|table|octo
public|PG_ROLES|table|octo
public|PG_SETTINGS|table|octo
public|PG_TYPE|table|octo
public|PG_USER|table|octo
(26 rows)
OCTO> create view v1 as select * from names;
CREATE VIEW

OCTO> \dv;
Schema|Name|Type|Owner
public|V1|view|octo
OCTO> \d NAMES;
Table "NAMES"
Column|Type|Collation|Nullable|Default
ID|INTEGER||NOT NULL|
FIRSTNAME|VARCHAR(30)|||
LASTNAME|VARCHAR(30)|||
OCTO>
OCTO> \d v1;
View "V1"

Column|Type|Collation|Nullable|Default
ID|INTEGER||||
FIRSTNAME|VARCHAR(30)||||
LASTNAME|VARCHAR(30)||||
View definition:
create view v1 as select * from names;

\d tablename displays CHECK constraints, if defined.

Example:

OCTO> \d EMPLOYEE;
Table "EMPLOYEE"
Column|Type|Collation|Nullable|Default
ID|INTEGER||NOT NULL|
FIRSTNAME|VARCHAR(30)|||
LASTNAME|VARCHAR(30)|||
AGE|INTEGER|||
Check constraints:
    "EMPLOYEE_AGE_CHECK" CHECK ((AGE >= 18))

Technical Notes

Types of quotes accepted

Octo accepts three types of quote syntax:

  1. Single-quotes

  2. Double-quotes

  3. Backticks

Single-quotes are used to designate string literals, which Octo treats as belonging to the VARCHAR SQL type.

Double-quotes can be used in two different ways: as case-sensitive SQL identifiers and as string literals in and only in GLOBAL keyword definitions in CREATE TABLE (DDL) statements. Identifiers, whether case sensitive or not, are used to reference tables, columns, etc. For example, table names and column names are identifiers. String literals in GLOBAL keyword definitions contain M code that maps the given SQL table to a YottaDB global variable node.

Finally, the backtick character ("`") is used to enclose words so that any possible reserved words that may be used in column or table names are correctly escaped.

In addition, the backtick/backquote character also ensures the column name is treated as is and no case conversions are done. This lets us use the column name as is in the keys() or values() specifications. If the column name had not been enclosed inside double quotes or backquotes, the column name would be lower cased internally by Octo and the keys() or values() syntax would have to only specify the lower cased name.

Exceptions to BNF grammar

The following rule for a row_value_constructor is currently a deviation from BNF due to a Reduce-Reduce conflict in the grammar:

row_value_constructor : [(][value_expression | null_specification | default_specification] [, ....][)];

A primary value expression is denoted as follows:

value_expression: unsigned_value_specification | column_reference | COUNT (\*|[set_quantifier] value_expression) | general_set_function | scalar_subquery | (value_expression);

The value expression can contain an unsigned value, a column reference, a set function, a subquery or table_name.*

table_name.* usage:

  • When table_name.* is used, all columns of the table specified are included

  • It can be used in SELECT, GROUP BY, and ORDER BY column list

  • It can also be used with set functions in SELECT, HAVING and ORDER BY expressions

  • Apart from COUNT other set functions can have table_name.* only when the table has a single column and if its type is compatible with the function.

  • When COUNT( [set_quantifier] table_name.* ) is used as a column in SELECT, other columns have to either be present in GROUP BY or should be part of a set_function otherwise error is raised for the column not following this condition

  • When table_name.* is used with COUNT, all columns of the table are considered for processing. In case a row exists where all columns have artificial NULL values, COUNT(tablename.*) or COUNT(DISTINCT tablename.*) will not include the row in its result. We can end up with such a row when an outer join is used and there is no match for the right table, in this case the rows of the right table in the join will have only artificial NULL values.

  • Comparison between two dissimilar table_name.* values are not allowed

  • Comparison operations such as <, >, >=, <=, = and != between two table_name.* are carried out column-wise with the condition that NULL values are equal to NULL values and NULL values are greater than non-NULL values. Result of this type of operation is always a boolean value. In case of an outer join resulting in an entire row being NULL for one or both of the table_name.* operands then the comparison operation described previously will result in a NULL result. Such NULL values are sorted last by ORDER BY.

  • Comparison between a table_name.* and a NULL literal results in a NULL value. Comparison of table_name.* with any other literal type is invalid and an error is issued.

general_set_function refers to functions on sets like AVG, SUM, MIN, MAX etc. A set function can also contain the keyword COUNT, to count the number of resulting columns or rows that result from the query.

A query expression can be a joined table or a non joined query expression.

query_expression: non_join_query_expression | joined_table;

The non_join_query_expression includes simple tables and column lists.

Advanced Mapping of YottaDB Globals to Octo Tables

Northwind DDL Example

The following is a CREATE TABLE statement from the Northwind database adapted for Octo.

CREATE TABLE nwCustomers(
  CustomerID INTEGER PRIMARY KEY,
  CustomerName VARCHAR(48),
  ContactName VARCHAR(32),
  Address VARCHAR(64),
  City VARCHAR(32),
  PostalCode VARCHAR(16),
  Country VARCHAR(32)
)
GLOBAL "^nwCustomers";

In the above, the nwCustomers table maps data in nodes of the global variable ^nwCustomers. The columns of the primary key of the table are all subscripts of a global variable node (all columns in the primary key are global variable subscripts; all global variable subscripts are not necessarily columns, as shown by the next example). The ^nwCustomers global variable has one subscript, an integer mapping to the column CustomerID.

Columns such as CustomerName are pieces of the node, using the default "|" as the piece separator, in the order listed. If PIECE is not specified, Octo maps columns in the order in which they appear in the CREATE TABLE statement to consecutive pieces of the global node value.

VistA DDL Example 1

The following is a CREATE TABLE for the INDEX_DESCRIPTION table of a VistA environment. This illustrates how part of a global variable tree is mapped to a table, i.e., different parts of a different global variable tree can potentially be mapped to different tables.

CREATE TABLE `INDEX_DESCRIPTION`(
 `INDEX_ID` NUMERIC PRIMARY KEY START 0 END "'(keys(""index_id""))!(keys(""index_id"")="""")",
 `INDEX_DESCRIPTION_ID` NUMERIC KEY NUM 1 START 0 END "'(keys(""index_description_id""))!(keys(""index_description_id"")="""")",
 `DESCRIPTION` VARCHAR GLOBAL "^DD(""IX"",keys(""index_id""),.1,keys(""index_description_id""),0)"
    EXTRACT "$G(^DD(""IX"",keys(""index_id""),.1,keys(""index_description_id""),0))"
)
GLOBAL "^DD(""IX"",keys(""index_id""),.1,keys(""index_description_id""))";

The table has a numeric primary key, INDEX_ID. START 0 means that a $ORDER() loop to find the next subscript starts with 0 and END "'(keys(""index_description_id""))!(keys(""index_description_id"")="""")" means that the loop ends when the result of that $ORDER() is 0 or the empty string (""), indicating the end of breadth first traversal of that level of the tree.

GLOBAL "^DD(""IX"",keys(""index_id""),.1,keys(""index_description_id""))" means that the table is in multiple ^DD("IX",…,.1,…) subtrees of ^DD with the primary key INDEX_ID in the second subscript, and the INDEX_DESCRIPTION_ID column in the fourth subscript, with .1 as the third subscript. GLOBAL can also be applied at the COLUMN level to allow a table to incorporate columns from different global variables, with the restriction that KEY columns of a table must all be subscripts of the same global variable.

The DESCRIPTION column is a text field, whose value is the entire global variable node. Unlike the previous example, the global variable node is not piece separated columns. EXTRACT in a column specification overrides any implicit or explicit PIECE specification for that column.

The backtick character ("`") is used to enclose words so that any possible reserved words that may be used in column or table names are correctly escaped.

In addition, the backtick/backquote character also ensures the column name is treated as is and no case conversions are done. This lets us use the column name as is in the keys() specifications. If the column name had not been enclosed inside double quotes or backquotes, the column name would be lower cased internally by Octo and the keys() syntax would have to only specify the lower cased name.

VistA DDL Example 2

The following is another example from a VistA environment, automatically generated by the VistA Fileman to Octo DDL mapping tool.

CREATE TABLE `LINE_PORT_ADDRESS`(
 `LINE_PORT_ADDRESS_ID` NUMERIC PRIMARY KEY START 0 END "'(keys(""line_port_address_id""))!(keys(""line_port_address_id"")="""")",
 `NAME` CHARACTER(30) NOT NULL GLOBAL "^%ZIS(3.23,keys(""line_port_address_id""),0)" PIECE 1,
 `LOCATION` CHARACTER(30) GLOBAL "^%ZIS(3.23,keys(""line_port_address_id""),0)" PIECE 2,
 `DEVICE` INTEGER GLOBAL "^%ZIS(3.23,keys(""line_port_address_id""),0)" PIECE 3,
 `SUBTYPE` INTEGER GLOBAL "^%ZIS(3.23,keys(""line_port_address_id""),0)" PIECE 4
)
GLOBAL "^%ZIS(3.23,keys(""line_port_address_id""))"
DELIM "^";

DELIM "^" specifies to Octo that "^" is the piece separator to use when mapping values of global variable nodes into columns.

Retrieve Entire Node Example

To create a column that retrieves an entire database node, specify the empty string as the column's delimiter using the DELIM keyword, e.g.:

CREATE TABLE DELIMNAMES (
    id INTEGER PRIMARY KEY,
    firstName VARCHAR(30),
    lastName VARCHAR(30),
    middleInitial VARCHAR(1),
    age INTEGER,
    fullProfile VARCHAR DELIM ""
)
GLOBAL "^delimnames(keys(""id""))";

This DDL can be used to access data like the following:

YottaDB MUPIP EXTRACT
21-SEP-2018  14:55:45 ZWR
^delimnames(0)="Zero|Cool|B|25"
^delimnames(1)="Acid|Burn|I|22"
^delimnames(2)="Cereal|Killer|A|59"
^delimnames(3)="Lord|Nikon|O|"
^delimnames(4)="Joey|||42"
^delimnames(5)="Zero|Cool|B|25"

Given this DDL and data set, running SELECT * FROM delimnames; in Octo will yield the following output:

ID|FIRSTNAME|LASTNAME|MIDDLEINITIAL|AGE|FULLPROFILE
0|Zero|Cool|B|25|Zero|Cool|B|25
1|Acid|Burn|I|22|Acid|Burn|I|22
2|Cereal|Killer|A|59|Cereal|Killer|A|59
3|Lord|Nikon|O||Lord|Nikon|O|
4|Joey|||42|Joey|||42
5|Zero|Cool|B|25|Zero|Cool|B|25
(6 rows)

Column-level Delimiter Example

It is possible to define Octo globals with a variety of delimiters. To map data in such globals, you can use the column-level DELIM keyword. For example, consider the following DDL:

CREATE TABLE names (
    id       INTEGER,
    given    VARCHAR(15),
    surname  VARCHAR(15),
    street1  VARCHAR(50) GLOBAL "^names(keys(""id""),""Address"")" DELIM '^' PIECE 1,
    street2  VARCHAR(50) GLOBAL "^names(keys(""id""),""Address"")" DELIM '^' PIECE 2,
    city     VARCHAR(30) GLOBAL "^names(keys(""id""),""Address"")" DELIM '^' PIECE 3,
    province VARCHAR(30) GLOBAL "^names(keys(""id""),""Address"")" DELIM '^' PIECE 4,
    country  VARCHAR(30) GLOBAL "^names(keys(""id""),""Address"")" DELIM '^' PIECE 5,
    postal   VARCHAR(10) GLOBAL "^names(keys(""id""),""Address"")" DELIM '^' PIECE 6,
    primary key (id)
) GLOBAL "^names";

This DDL can be used to map global data like this:

YottaDB MUPIP EXTRACT /usr/library/V999_R139/pro/mupip extract -select=names names.zwr UTF-8
25-APR-2023  10:49:48 ZWR
^names(0)="Zero|Cool"
^names(0,"Address")="123 Any Lane^APT 2^Malvern^Pennsylvania^US^11000"
^names(1)="Acid|Burn"
^names(1,"Address")="2449 Brick Kiln Road^^Ely^County Antrim^UK^L3G 6JJ"
^names(2)="Cereal|Killer"
^names(2,"Address")="Drosselvænget 7852^^Sundby^Nordjylland^Denmark^89536"
^names(3)="Lord|Nikon"
^names(3,"Address")="Rue de la Gare 5883^^Aulnay-sous-Bois^Nord^France^31129"
^names(4)="Joey|"
^names(4,"Address")="5106 Sampige Rd^APT 5^Ghaziabad^Jammu and Kashmir^India^63988"
^names(5)="Zero|Cool"
^names(5,"Address")="Porodice Praizović 8163^^Bačka Topola^North Banat^Serbia^73571"

Note in particular the presence of the caret character (^) as a delimiter in the node values of this data set.

Given this data and the preceding DDL, running select * from names; in Octo will yield the following output:

ID|GIVEN|SURNAME|STREET1|STREET2|CITY|PROVINCE|COUNTRY|POSTAL
0|Zero|Cool|123 Any Lane|APT 2|Malvern|Pennsylvania|US|11000
1|Acid|Burn|2449 Brick Kiln Road||Ely|County Antrim|UK|L3G 6JJ
2|Cereal|Killer|Drosselvænget 7852||Sundby|Nordjylland|Denmark|89536
3|Lord|Nikon|Rue de la Gare 5883||Aulnay-sous-Bois|Nord|France|31129
4|Joey||5106 Sampige Rd|APT 5|Ghaziabad|Jammu and Kashmir|India|63988
5|Zero|Cool|Porodice Praizović 8163||Bačka Topola|North Banat|Serbia|73571
(6 rows)

Mixed Delimiter Example

Octo tables can also be mapped to YottaDB global variables using mixed row delimiters or using a combination of delimiters and JSON.

If you want to map an Octo table to global that uses mixed row delimiters, you can use a DDL like this:

CREATE TABLE authors(
    id INTEGER,
    title VARCHAR PIECE 1,
    author VARCHAR PIECE 2,
    author_firstname VARCHAR EXTRACT "$PIECE(values(""author""),""; "",2)",
    author_lastname  VARCHAR EXTRACT "$PIECE(values(""author""),""; "",1)",
    year INTEGER PIECE 3,
    PRIMARY KEY(id)
)
DELIM "^"
GLOBAL "^X"
READONLY;

This DDL will allow Octo to retrieve of data from YottaDB that is formatted like this:

YottaDB MUPIP EXTRACT /usr/library/V999_R139/pro/mupip extract -sel=^X x.zwr UTF-8
05-APR-2023  12:46:40 ZWR
^X(1)="Lijmen Het Been^Elschot; Willem^2014"
^X(2)="Jane Austen^Ichiro; Takahashi^1891"

Given this CREATE TABLE statement and ZWR data, running SELECT * FROM authors; in Octo will yield the following output:

ID|TITLE|AUTHOR|AUTHOR_FIRSTNAME|AUTHOR_LASTNAME|YEAR
1|Lijmen Het Been|Elschot; Willem|Willem|Elschot|2014
2|Jane Austen|Ichiro; Takahashi|Takahashi|Ichiro|1891
(2 rows)

The previous example can be re-written to remove the author column.

CREATE TABLE authors(
    id INTEGER,
    title VARCHAR PIECE 1,
    firstname VARCHAR EXTRACT "$PIECE($PIECE(^X(keys(""id"")),""^"",2),""; "",2)",
    lastname  VARCHAR EXTRACT "$PIECE($PIECE(^X(keys(""id"")),""^"",2),""; "",1)",
    year INTEGER PIECE 3,
    PRIMARY KEY(id)
)
DELIM "^"
GLOBAL "^X"
READONLY;

Running SELECT * FROM authors; against this version of the authors table in Octo will yield the following output:

ID|TITLE|FIRSTNAME|LASTNAME|YEAR
1|Lijmen Het Been|Willem|Elschot|2014
2|Jane Austen|Takahashi|Ichiro|1891

Delimiters with JSON Example

If you want to map an Octo table to global that uses delimiters and JSON, you can use a DDL like this:

CREATE TABLE specialone
(
    loc VARCHAR(30),
    dep VARCHAR(30),
    level INTEGER,
    title VARCHAR(30),
    lang VARCHAR(2)  EXTRACT "$SELECT($PIECE(^ZMYGLOBAL(keys(""loc""),keys(""dep""),keys(""level"")),""^"",2)=""N"":""dut"",1:$PIECE(^(keys(""level"")),""^"",2)",
    isbn VARCHAR(16) EXTRACT "$$^jsonField($PIECE(^ZMYGLOBAL(keys(""loc""),keys(""dep""),keys(""level"")),""^"",3),""isbn"")",
    co   INTEGER     EXTRACT "$$^jsonField($PIECE(^ZMYGLOBAL(keys(""loc""),keys(""dep""),keys(""level"")),""^"",3),""co"")",
    PRIMARY KEY (loc,dep,level)
)
GLOBAL "^ZMYGLOBAL"
DELIM "^"
READONLY;

This DDL will provide access to the following data:

YottaDB MUPIP EXTRACT /usr/library/V999_R139/pro/mupip extract -sel=^ZMYGLOBAL zmyglobal.zwr UTF-8
05-APR-2023  12:47:18 ZWR
^ZMYGLOBAL("antwerp","cde",14)="Lijmen^N^{""isbn"":9789025313210,""co"":1234}"
^ZMYGLOBAL("antwerp","qeb",9)="Austen^Y^{""isbn"":012345678234,""co"":9999}"

The jsonField M routine is defined as:

new decodeddata,error
do decode^%ydbwebjson($name(jsondata),$name(decodeddata),$name(error))
; Data that is numeric and looks like strings may need special treatment.
; Look at '\n' data and return that if it exists
if $data(decodeddata(field,"\n")) quit decodeddata(field,"\n")
quit $get(decodeddata(field))

A select * from specialone; query against the specialone table will yield the following output:

LOC|DEP|LEVEL|TITLE|LANG|ISBN|CO
antwerp|cde|14|Lijmen|dut|9789025313210|1234
antwerp|qeb|9|Austen|Y|012345678234|9999

SQL NULL Values

Octo treats every empty string ('') specified in a query as if NULL was instead specified. This differs from Postgres where empty strings and NULL are treated differently. Therefore queries that use empty strings will most likely need to be examined and reworded to instead use NULL.

For example, select * from names where lastname = '' is equivalent to select * from names where lastname = NULL. And since the check lastname = NULL will never evaluate to TRUE, the query should instead be reworded as select * from names where lastname is NULL to return the intended results.

SQL allows columns other than key columns to be NULL by default. Consider a YottaDB global node ^USAddress("White House")="1600 Pennsylvania Ave NW||Washingtion|DC|20500-0005" mapped to a table defined as follows:

CREATE TABLE USFamousAddresses(
  CommonName VARCHAR PRIMARY KEY,
  AddressLine1 VARCHAR,
  AddressLine2 VARCHAR,
  City VARCHAR,
  Territory VARCHAR(2),
  Zip VARCHAR(10)
)
GLOBAL "^USAddresses";

The second piece of the node, which corresponds to the AddressLine2 column, is an empty string ('' in SQL). In this case, Octo treats the AddressLine2 column as having a NULL value.