Database Intro & Basic SQL


  • A database is a large, organized collection of data.
  • A Database Management System (DBMS) is software that stores, manages and facilitates access to databases.
    • 传统DBMS即关系型数据库管理系统 Relational Database Management Systems (RDBMS)
    • SQL 数据描述和操作语言
    • ACID 数据库事务(Transaction)正确执行的四个基本要素的缩写:原子性(Atomicity)、一致性(Consistency)、隔离性(Isolation)、持久性(Durability)
  • Various levels of a database system

  • 关系型数据库的数据独立性
    • Logical data independence
    • Physical data independence

Environment Setup

  • docker pull continuumio/anaconda3
  • docker run -it -p 8888:8888 continuumio/anaconda3 /bin/bash
  • jupyter notebook --ip= --allow-root
  • In order to %load_ext sql in IPython, we do pip install ipython-sql

Concepts of SQL

  • SQL (Structured Query Language)
  • A table is also called relation.
  • An attribute is also a column.
  • A tuple is also a row / record.
  • The number of tuples is the cardinality of the relation.
  • The number of attributes is the arity of the relation.
  • Each attribute must have an atomic type
    • Characters: CHAR(20), VARCHAR(50)
    • Others: MONEY, DATETIME
  • The schema of a table is the table name, its attributes, and their types:
  • A key is an attribute whose values are unique; we underline a key
    • A key is a minimal subset of attributes that acts as a unique identifier for tuples in a relation
  • SQL commands are case insensitive.
  • Values are case sensitive.
  • We use single quotes for string constants.

Basic Commands

  • Check table enfo
    • %sql PRAGMA table_info(precipitation_full);
  • Create or Delete
    • %sql CREATE TABLE table_name(attribute TYPE [PRIMARY] [KEY], ...)
      • To support a primary key, we can also use PRIMARY KEY (attr1, attr2)
    • %sql DROP TABLE IF EXISTS table_name
    • When we need FOREIGN KEY, we run %sql PRAGMA foreign_keys = ON
      • As an attribute, FOREIGN KEY (attr) REFERENCES table_name(attr). – Deleting an item referred as foreign key will fail.

  • Insert
    • INSERT INTO table_name VALUES(row_info);
    • When we use a variable, use : before the variable.
  • Query

  • “LIKE”: simple string pattern matching
    • % = any sequence of characters
    • _ = any single character


  • In WHERE part, we can use equation like col1 = col2.
  • We can also use JOIN command(with equivalence):

  • If there are same attributes in tables to be joined. We need to solve variable ambiguity.
    • In FROM part, set the table variable. FROM Person p, Company c, so that we can use p.attr, c.attr to indicate the specific attribute.
    • Or, we can also use table_name.attr to indicate different attributes in different tables.
  • Semantics of a join:(not a real execution order)
    1. Take cross product, X = R × S
    2. Apply selections / conditions, Y = (r, s)∈X|r.A = =r.B
    3. Apply projections to get final output, Z = (y.A, )fory ∈ Y

  • Semantics of query
    1. FROM: compute cross product of tables
    2. WHERE: check conditions, discard tuples that fail
    3. SELECT: specify desired fields in output
    4. DISTINCT: eliminate duplicate rows

Leave a Reply

Your email address will not be published.