CREATE TABLE

Synopsis

Use the CREATE TABLE statement to create a new table in a keyspace. It defines the table name, column names and types, primary key, and table properties.

Syntax

Diagram

create_table

CREATETABLEIFNOTEXISTStable_name(table_schema)table_properties

table_schema

,column_namecolumn_typePRIMARYKEYSTATICPRIMARYKEY((,column_name),column_name)

table_properties

WITHANDproperty_name=property_literalCLUSTERINGORDERBY(,column_nameASCDESC)COMPACTSTORAGE

Grammar

create_table ::= CREATE TABLE [ IF NOT EXISTS ] table_name
                     '(' table_element [ ',' table_element ...] ')'
                     [WITH table_properties];

table_element ::= table_column | table_constraints

table_column ::= column_name column_type [ column_constraint ...]

column_constraint ::= PRIMARY KEY | STATIC

table_constraints ::= PRIMARY KEY '(' partition_key_column_list clustering_key_column_list ')'

partition_key_column_list ::= '(' column_name [ ',' column_name ...] ')' | column_name

clustering_key_column_list ::= [ ',' column_name ...]

table_properties = [table_options]
                    [[AND] CLUSTERING ORDER BY '(' column_ordering_property [ ',' column_ordering_property ...] ')']
                    [[AND] COMPACT STORAGE]

table_options = property_name '=' property_literal [AND property_name '=' property_literal ...]

column_ordering_property ::= column_name [ ASC | DESC ]

Where

  • table_name, column_name, and property_name are identifiers (table_name may be qualified with a keyspace name).
  • property_literal is a literal of either boolean, text, or map data type.

Semantics

  • An error is raised if table_name already exists in the associated keyspace unless the IF NOT EXISTS option is used.

PRIMARY KEY

  • Primary key must be defined in either column_constraint or table_constraint but not in both of them.
  • Each row in a table is uniquely identified by its primary key.
  • Primary key columns are either partitioning columns or clustering columns (described below).
  • If primary key is set as a column constraint, then that column is the partition column and there are no clustering columns.
  • If primary key is set as a table constraint then:
    • The partition columns are given by the first entry in the primary key list: the nested column list (if given), otherwise the first column.
    • The clustering columns are the rest of the columns in the primary key list (if any).
  • Types MAP, SET, LIST, JSONB, USER_DEFINED_TYPE cannot be used in the primary key.

PARTITION KEY

  • Partition key is required and defines a split of rows into partitions.
  • Rows that share the same partition key form a partition and will be colocated on the same replica node.

CLUSTERING KEY

  • Clustering key is optional and defines an ordering for rows within a partition.
  • Default ordering is ascending (ASC) but can be set for each clustering column as ascending or descending using the CLUSTERING ORDER BY table property.

STATIC COLUMNS

  • Columns declared as STATIC will share the same value for all rows within a partition (that is, rows having the same partition key).
  • Columns in the primary key cannot be static.
  • A table without clustering columns cannot have static columns (without clustering columns the primary key and the partition key are identical so static columns would be the same as regular columns).

table_properties

  • The CLUSTERING ORDER BY property can be used to set the ordering for each clustering column individually (default is ASC).
  • The default_time_to_live property sets the default expiration time (TTL) in seconds for a table. The expiration time can be overridden by setting TTL for individual rows. The default value is 0 and means rows do not expire.
  • The transactions property specifies if distributed transactions are enabled in the table. To enable distributed transactions, use transactions = { 'enabled' : true }.
  • Use the AND operator to use multiple table properties.
  • The other YCQL table properties are allowed in the syntax but are currently ignored internally (have no effect).
  • The TABLETS = <num> property specifies the number of tablets to be used for the specified YCQL table. Setting this property overrides the value from the --yb_num_shards_per_tserver option. For an example, see Create a table specifying the number of tablets.
  • COMPACT STORAGE is only for syntax compatibility with Cassandra. It doesn't affect the underlying storage.

Examples

Use column constraint to define primary key

'user_id' is the partitioning column and there are no clustering columns.

ycqlsh:example> CREATE TABLE users(user_id INT PRIMARY KEY, full_name TEXT);

Use table constraint to define primary key

'supplier_id' and 'device_id' are the partitioning columns and 'model_year' is the clustering column.

ycqlsh:example> CREATE TABLE devices(supplier_id INT,
                                    device_id INT,
                                    model_year INT,
                                    device_name TEXT,
                                    PRIMARY KEY((supplier_id, device_id), model_year));

Use column constraint to define a static column

You can do this as shown below.

ycqlsh:example> CREATE TABLE items(supplier_id INT,
                                  item_id INT,
                                  supplier_name TEXT STATIC,
                                  item_name TEXT,
                                  PRIMARY KEY((supplier_id), item_id));
ycqlsh:example> INSERT INTO items(supplier_id, item_id, supplier_name, item_name)
               VALUES (1, 1, 'Unknown', 'Wrought Anvil');
ycqlsh:example> INSERT INTO items(supplier_id, item_id, supplier_name, item_name)
               VALUES (1, 2, 'Acme Corporation', 'Giant Rubber Band');
ycqlsh:example> SELECT * FROM items;
 supplier_id | item_id | supplier_name    | item_name
-------------+---------+------------------+-------------------
           1 |       1 | Acme Corporation |     Wrought Anvil
           1 |       2 | Acme Corporation | Giant Rubber Band

Use table property to define the order (ascending or descending) for clustering columns

Timestamp column 'ts' will be stored in descending order (latest values first).

ycqlsh:example> CREATE TABLE user_actions(user_id INT,
                                         ts TIMESTAMP,
                                         action TEXT,
                                         PRIMARY KEY((user_id), ts))
                                         WITH CLUSTERING ORDER BY (ts DESC);
ycqlsh:example> INSERT INTO user_actions(user_id, ts, action) VALUES (1, '2000-12-2 12:30:15', 'log in');
ycqlsh:example> INSERT INTO user_actions(user_id, ts, action) VALUES (1, '2000-12-2 12:30:25', 'change password');
ycqlsh:example> INSERT INTO user_actions(user_id, ts, action) VALUES (1, '2000-12-2 12:30:35', 'log out');
ycqlsh:example> SELECT * FROM user_actions;
 user_id | ts                              | action
---------+---------------------------------+-----------------
       1 | 2000-12-02 19:30:35.000000+0000 |         log out
       1 | 2000-12-02 19:30:25.000000+0000 | change password
       1 | 2000-12-02 19:30:15.000000+0000 |          log in

Use table property to define the default expiration time for rows

You can do this as shown below.

ycqlsh:example> CREATE TABLE sensor_data(sensor_id INT,
                                        ts TIMESTAMP,
                                        value DOUBLE,
                                        PRIMARY KEY((sensor_id), ts))
                                        WITH default_time_to_live = 5;

First insert at time T (row expires at T + 5).

ycqlsh:example> INSERT INTO sensor_data(sensor_id, ts, value) VALUES (1, '2017-10-1 11:22:31', 3.1);

Second insert 3 seconds later (row expires at T + 8).

ycqlsh:example> INSERT INTO sensor_data(sensor_id, ts, value) VALUES (2, '2017-10-1 11:22:34', 3.4);

First select 3 seconds later (at time T + 6).

ycqlsh:example> SELECT * FROM sensor_data;
 sensor_id | ts                              | value
-----------+---------------------------------+-------
         2 | 2017-10-01 18:22:34.000000+0000 |   3.4

Second select 3 seconds later (at time T + 9).

ycqlsh:example> SELECT * FROM sensor_data;
 sensor_id | ts | value
-----------+----+-------

Create a table specifying the number of tablets

You can use the CREATE TABLE statement with the WITH tablets = <num> clause to specify the number of tablets for a table. This is useful to scale the table up or down based on requirements. For example, for smaller static tables, it may be wasteful to have a large number of shards (tablets). In that case, you can use this to reduce the number of tablets created for the table. Similarly, for a very large table, you can use this statement to presplit the table into a large number of shards to get improved performance.

Note that YugabyteDB, by default, presplits a table in yb_num_shards_per_tserver * num_of_tserver shards. This clause can be used to override that setting on per-table basis.

ycqlsh:example> CREATE TABLE tracking (id int PRIMARY KEY) WITH tablets = 10;

If you create an index for these tables, you can also specify the number of tablets for the index.

You can also use AND to add other table properties, like in this example.

ycqlsh:example> CREATE TABLE tracking (id int PRIMARY KEY) WITH tablets = 10 AND transactions = { 'enabled' : true };

See also