Overview
In this document we explore some of the meta parts of BlueQL. If you want to look at how you can use BlueQL, consider looking at the sections that follow.
Design principles:
- Simplicity and clarity: The language shouldn't be overwhelming to understand
- Security with mandatory parameterization: We want to reduce the surface of injection attacks. For this reason, parameterization is mandatory.
Just like SQL, BlueQL has three categories of commands/queries inside it:
- DDL: Data definition language is used to define, modify and/or remove DDL objects such as
space
s andmodel
s - DCL: Data control language is used to control the access to data, and perform other administrative tasks
- DML: Data manipulation language is used to manipulate data
Jump to differences from SQL.
This text is not a detailed, formal guide. It's meant for developers and users who want to work with Skytable. If you need a more formal specification, like a grammar definition, please ask us, and we'll create it. We haven't published it yet because no one has requested it.
Identifiers
Can begin with any ASCII alphabet or an underscore (_
) and then have any number of alphanumeric characters and/or underscores.
Keywords
Keywords are identifiers with special meanings and hence can't be used as identifiers in other places. Here's a full-list of keywords:
[
"sysctl", "create", "alter", "drop", "use", "inspect", "describe", "insert", "select", "update",
"delete", "exists", "table", "model", "space", "index", "type", "function", "rename", "add",
"remove", "transform", "set", "return", "order", "sort", "group", "limit", "asc", "desc", "all",
"by", "with", "on", "from", "into", "as", "to", "in", "of", "and", "or", "not", "if", "else",
"where", "when", "allow", "auto", "default", "null", "transaction", "batch", "lock", "read",
"write", "begin", "end", "key", "value", "primary", "truncate"
]
Data types
Boolean
A boolean value, either true
or false
Unsigned integers
uint8
: unsigned 8-bit integeruint16
: unsigned 16-bit integeruint32
: unsigned 32-bit integeruint64
: unsigned 64-bit integer
Signed integers
sint8
: signed 8-bit integersint16
: signed 16-bit integersint32
: signed 32-bit integersint64
: signed 64-bit integer
Floating point values
float32
: a single-precision floatfloat64
: a double-precision float
Simple collections
binary
: a binary blob represented as a sequence ofuint8
valuesstring
: an UTF-8 string
Complex collections
list
: a list of any of the data types, including nested lists- A list is represented as:
[]
with values inbetween. For example, alist { type:string }
would be represented as:["sayan", "loves", "dogs"]
- Lists cannot contain null values
- List can be nested: You can have heavily nested lists like:
[[[]], [["another one"]]]
- List can only have one base type: This means that if you have a list like
[[[string]]]
each element must either be the same nested list, or an empty list
- A list is represented as:
New data types are frequently added, so treat this list as non-exhaustive.
Literals
- Null literal:
null
- Numeric literals:
- Unsigned:
1234
- Signed:
[-]1234
- Unsigned:
- Floating point literals:
[-]1234.5678
- String literals:
"hello"
- Binary literals: `binary`
- List literals:
[..]
- Dictionaries:
{<ident>: <literal>}
It is very important for you to know that literals are not allowed everywhere. The only literals allowed everywhere are:
- Lists
- Dictionaries
Read below to understand why.
Parameters
All literals apart from dictionaries and lists must be used as parameters. BlueQL only allows literals as parameters. For example, using the Rust client, if you were to run this:
insert into myspace.mymodel('sayan', 'pass123', ['myfirstnote'])
You are required to parameterize the query like this:
use skytable::query;
let query = query!("insert into myspace.mymode(?, ?, [?])", "sayan", "pass123", "myfirstnote")
If you try to run it without any parameters (don't forget that skysh
automatically parameterizes for convenience) the query
won't even compile.
Just so you know, parameterizing involves passing a separate parameter list, with each parameter encoded. You wouldn't need to worry about this because the client driver does all of that for you!
The question is why? The answer is security. SQL-injection vulernabilties have already costed companies a lot, so we don't want to inherit that from SQL.
Also, parameterization is exclusively possible for literals. This means that whenever you're accepting data from an untrusted source, it becomes a parameter. If you try to not use parameters, the query will not even compile.
On a final note, BlueQL doesn't support comments of any form also for security reasons.
Expressions
+=
: add RHS to LHS- Can be used outside arithmetic contexts
- Add a char to a field
mystring
:mystring += ",world"
- Add a list to a nested list field:
mylist += ["item in nested list"]
-=
: subtract RHS from LHS/=
: divide LHS by RHS*=
: multiply LHS by RHS
DDL
Queries include:
- Spaces:
CREATE SPACE myspace [WITH { property: value, ... }]
ALTER SPACE myspace [WITH { property: updated_value, ... }]
DROP SPACE [allow not empty] myspace
- Models:
CREATE MODEL myspace.mymodel([primary] [null] field: field_type, ...) [WITH { property: value, ... }]
ALTER MODEL myspace.mymodel (ADD ... | UPDATE ... | REMOVE ...)
DROP MODEL [allow not empty] myspace.mymodel
INSPECT GLOBAL
: inspects global state, shows a list of spaces and usersINSPECT SPACE <space>
: inspects a single space, shows a list of models and other thingsINSPECT MODEL <model>
: inspects a single model, shows information about stored data and other things
DML
INSERT INTO myspace.mymodel(...)
SELECT col1, ... FROM myspace.mymodel WHERE ...
UPDATE myspace.mymodel SET counter += 1 WHERE ...
DELETE FROM myspace.mymode WHERE ...
DCL
Queries include:
SYSCTL REPORT STATUS
: returns the status of the system. (Not a control query per se)SYSCTL CREATE USER "username" WITH { password: ... }
: create a new userSYSCTL DROP USER "username"
: removes the user in question
Differences from SQL
- No literals (see above)
- Mandatory parameterization (see above)
- No semicolons!
- Only one statement per query. For multiple statements batches must be used
- DML queries are point queries (hence must contain a
WHERE
clause)