How databases work: Part 2
In the first installment of the How databases work series we learned about the general architectural overview of a database and started working on our own clone (clone of a clone to be exact) of an SQLite database.
Following part two of the original article series, in this article we will lay the ground work for what will be our SQL compiler and our VM.
Breaking things up
As explained in the first article, SQLite is conceptually broken up into a front and a back-end. The front-end is the compiler, which takes SQL queries and outputs bytecode. The backend has several components, one of which is the VM, which executes bytecode supplied by the compiler.
So why do we have to break things up into separate components at all? Here is what the author of the original article series has to say about this:
Breaking things into two steps like this has a couple advantages:
- Reduces the complexity of each part (e.g. virtual machine does not worry about syntax errors)
- Allows compiling common queries (Prepared statements) once and caching the bytecode for improved performance.
Meta-commands
In SQL, commands that start with a dot (.) are called meta-commands. For instance, .exit
is a meta-command. We will the method doMetaCommand()
to our SQLite clone, that handles meta commands, if the input starts with a dot.
The doMetaCommand()
method returns a MetaCommandResult enum.
enum MetaCommandResult {
case META_COMMAND_SUCCESS
case META_COMMAND_UNRECOGNIZED_COMMAND
}
...
...
//READING INPUT
if input.hasPrefix(".") {
switch doMetaCommand(input) {
case .META_COMMAND_SUCCESS:
continue;
case .META_COMMAND_UNRECOGNIZED_COMMAND:
print("Unrecognized command \(input) \n")
continue;
}
}
Two new statements
We're also adding support for the insert
and the select
keywords. If the input given by the user was not a meta-command, we pass the input to the compiler. In our case the compiler will simply be a prepareStatement()
method, which takes the given input and returns a (PrepareStatementResult
, Statement?
) tuple. While in the original article a pointer to a Statement
struct is passed into the prepare_statement()
method, it seems "swiftier" to return a new Statement
struct from the prepareStatement()
method rather than to pass in a Statement
as in-out parameter and overwrite its contents.
enum PrepareStatementResult {
case PREPARE_STATEMENT_SUCCESS
case PREPARE_STATEMENT_UNRECOGNIZED_STATEMENT
}
enum StatementType {
case STATEMENT_INSERT
case STATEMENT_SELECT
}
struct Statement {
var type: StatementType;
}
...
...
while(true){
...
...
if input.hasPrefix(".") {
...
...
}
//PREPARING STATEMENT AKA COMPILING SQL
let statement: Statement
switch prepareStatement(input) {
case let (.PREPARE_STATEMENT_SUCCESS, state):
statement = state!;
break;
case (.PREPARE_STATEMENT_UNRECOGNIZED_STATEMENT, _):
print("Unrecognized keyword at start of \(input) \n")
continue
}
}
Executing
We also add an executeStatement()
method. This can be thought of as the VM. It takes a statement and does something depending on what kind of statement it is.
The execute statement simply switches on the type of the passed Statement.type
and takes appropriate action.
func executeStatement(_ statement: Statement) {
switch statement.type {
case .STATEMENT_INSERT:
print("Inserting data into database")
case .STATEMENT_SELECT:
print("Querying data from database")
}
}
...
...
while(true){
...
...
executeStatement(statement)
print("Executed. \n")
This is what our database looks like after implementing all the new features of this article:
//MARK: TYPES
enum ExitResult: Int32 {
case EXIT_SUCCESS = 0
case EXIT_FAILURE
}
enum MetaCommandResult {
case META_COMMAND_SUCCESS
case META_COMMAND_UNRECOGNIZED_COMMAND
}
enum PrepareStatementResult {
case PREPARE_STATEMENT_SUCCESS
case PREPARE_STATEMENT_UNRECOGNIZED_STATEMENT
}
enum StatementType {
case STATEMENT_INSERT
case STATEMENT_SELECT
}
struct Statement {
var type: StatementType;
}
//MARK: FUNCTIONS
func printPrompt() {
print("db >")
}
func readInput() -> String {
if let line = readLine() {
return line
} else {
return ""
}
}
func doMetaCommand(_ input: String) -> MetaCommandResult {
if input == ".exit" {
exit(ExitResult.EXIT_SUCCESS.rawValue);
} else {
return .META_COMMAND_UNRECOGNIZED_COMMAND
}
}
func prepareStatement(_ input: String) -> (PrepareStatementResult, Statement?) {
if input.lowercased().hasPrefix("insert") {
let statement = Statement(type: .STATEMENT_INSERT)
return (.PREPARE_STATEMENT_SUCCESS, statement)
}
if input.lowercased().hasPrefix("select") {
let statement = Statement(type: .STATEMENT_SELECT)
return (.PREPARE_STATEMENT_SUCCESS, statement)
}
return (.PREPARE_STATEMENT_UNRECOGNIZED_STATEMENT, nil)
}
func executeStatement(_ statement: Statement) {
switch statement.type {
case .STATEMENT_INSERT:
print("Inserting data into database")
case .STATEMENT_SELECT:
print("Querying data from database")
}
}
while(true){
printPrompt()
let input = readInput();
//READING INPUT
if input.hasPrefix(".") {
switch doMetaCommand(input) {
case .META_COMMAND_SUCCESS:
continue;
case .META_COMMAND_UNRECOGNIZED_COMMAND:
print("Unrecognized command \(input) \n")
continue;
}
}
//PREPARING STATEMENT AKA COMPILING SQL
let statement: Statement
switch prepareStatement(input) {
case let (.PREPARE_STATEMENT_SUCCESS, state):
statement = state!;
break;
case (.PREPARE_STATEMENT_UNRECOGNIZED_STATEMENT, _):
print("Unrecognized keyword at start of \(input) \n")
continue
}
//EXECUTING STATEMENT AKA THE VM
executeStatement(statement)
print("Executed. \n")
}
This was the second installment of How databases work. In the next post we will create columns and rows and add actual data to our database.