Basics of MongoDB



Introduction

  • MongoDB is a fast document-oriented database
  • replaces the concept of a "row" with a more flexible model - the "document"
  • convenient data storage for modern object-oriented languages
  • no predefined schemas
  • no multi-document transactions before version 4.0 (writes to a single document are always atomic)
  • no SQL
  • supports indexes
  • easy to scale out horizontally


Getting Started

  • document is a basic unit of data (like row in RDBMS)
  • collection = table
  • multiple databases in single instance
  • _id in every document, unique within collection
  • JavaScript shell for administration and data manipulation


Documents

  • document is an ordered set of keys with associated values
  • representation of a document varies by programming language (map; hash - Perl, Ruby; dictionary - Python)
  • objects in JavaScript {key:value}
  • example {"company" : "NobleProg", "training" : "MongoDB for Developers"}
  • key is a string; any UTF-8 character is allowed except \0 (. and $ are reserved and should be avoided)
  • type-sensitive {"age" : 3}, {"age" : "3"}
  • case-sensitive {"age" : 3}, {"Age" : 3}
  • documents cannot contain duplicated keys
  • key/value pairs are ordered {"x" : 1, "y" : 1} != {"y" : 1, "x" : 1}
    • order does not usually matter, MongoDB can reorder keys


{
  "_id" : ObjectId("545a414c7907b2a255b156c5"),
  "Name" : "Sean Connery",
  "Nationality" : "Great Britain",
  "BirthDate" : ISODate("1930-08-25T00:00:00Z"),
  "BirthYear" : 1930,
  "Occupation" : [
    "Actor",
    "Director",
    "Producer"
  ],
  "Movie" : [
    {
      "_id" : ObjectId("545a5f167907b2a255b156c7"),
      "Title" : "Dr. No"
    },
    {
      "_id" : ObjectId("545a5f317907b2a255b156c8"),
      "Title" : "From Russia with Love"
    },
    {
      "_id" : ObjectId("545a5ed67907b2a255b156c6"),
      "Title" : "Never Say Never Again"
    }
  ],
  "BirthPlace" : {
    "Country" : "United Kingdom, Scotland",
    "City" : "Edinburgh"
  }
}
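
The key rules from this section can be tried out in plain JavaScript, outside the shell. This is only a sketch: `isValidKey` is a hypothetical helper (not part of MongoDB or the shell), and it conservatively rejects any key containing \0, . or $.

```javascript
// Hypothetical helper (not a MongoDB API): conservatively rejects keys
// containing the characters singled out in the list above (\0 . $).
function isValidKey(key) {
  return typeof key === "string" && !/[\0.$]/.test(key);
}

// Values are type-sensitive: the number 3 is not the string "3".
const typeSensitive = 3 !== "3";

// Keys are case-sensitive: "age" and "Age" are two different keys.
const doc = { age: 3, Age: 4 };
const keyCount = Object.keys(doc).length;

// Key order is part of the document: the serialized forms differ.
const a = JSON.stringify({ x: 1, y: 1 });
const b = JSON.stringify({ y: 1, x: 1 });
const orderMatters = a !== b;

console.log(isValidKey("company"), isValidKey("a.b"), isValidKey("$set"));
console.log(typeSensitive, keyCount, orderMatters);
```

The comparison via JSON.stringify only illustrates that order is observable; it is not how MongoDB compares documents.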


Collections

  • a group of documents
  • dynamic schemas
    • {"company" : "NobleProg"}
    • {"age" : 5}
  • why should we use more than one collection?
    • keeping different kinds of documents in one collection is a nightmare for developers
    • getting a list of collections is much faster than extracting document types from a single collection
    • grouping documents of the same kind
    • indexes are defined per collection
  • collection name is a string, any UTF-8 is allowed except:
    • the empty string, names starting with the "system." prefix, names containing the \0 or $ character
  • subcollections separated by the . character
    • example GridFS (fs.files, fs.chunks)
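
The naming rules above can be sketched as a small check in plain JavaScript. `isValidCollectionName` is a hypothetical helper written for illustration, not a function provided by MongoDB; it encodes only the restrictions listed in this section.

```javascript
// Hypothetical helper (not a MongoDB API): applies the naming rules listed above.
function isValidCollectionName(name) {
  if (typeof name !== "string" || name === "") return false;   // empty string
  if (name.startsWith("system.")) return false;                // reserved prefix
  if (name.includes("\0") || name.includes("$")) return false; // forbidden characters
  return true;
}

// Subcollections are ordinary names that use "." as a separator.
const names = ["people", "fs.files", "fs.chunks", "", "system.users", "my$coll"];
for (const name of names) {
  console.log(JSON.stringify(name), isValidCollectionName(name));
}
```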


Databases

  • database is a group of collections
  • one database = one application
  • separated databases for different applications, users
  • database name is an alphanumeric string, case sensitive, max 64 bytes, empty string is not allowed
  • database name will end up as a file on the filesystem (this explains the restrictions)
  • special databases: admin (root database), local (never replicated), config (when sharding)
  • namespace is a concatenation of database and collection name (fully qualified collection name)
    • max 121 bytes, should be less than 100
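
A namespace and its length can be computed in plain JavaScript, as sketched below. The helper names are hypothetical, and the two limits are simply the numbers quoted in the list above; they are assumptions of this sketch, not values read from a server.

```javascript
// Limits quoted from the list above; treat them as assumptions for this sketch.
const MAX_NAMESPACE_BYTES = 121;
const RECOMMENDED_NAMESPACE_BYTES = 100;

// Hypothetical helper: builds the fully qualified collection name.
function namespaceOf(dbName, collName) {
  return dbName + "." + collName;
}

// Length is measured in UTF-8 bytes, not in characters.
function namespaceBytes(ns) {
  return new TextEncoder().encode(ns).length;
}

const ns = namespaceOf("NobleProg", "people");
console.log(ns, namespaceBytes(ns));
console.log(namespaceBytes(ns) <= MAX_NAMESPACE_BYTES);
console.log(namespaceBytes(ns) < RECOMMENDED_NAMESPACE_BYTES);
```

Counting bytes rather than characters matters for names with non-ASCII characters, which take more than one byte each in UTF-8.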


Getting and Starting MongoDB

  1. Installation on Windows
  2. Installation on Ubuntu


CRUD

Create

use NobleProg
person = {"Name" : "Sean Connery", "Nationality" : "Great Britain"}
db.people.insert(person) // .insert() is deprecated now, use .insertOne() instead

Read

db.people.find()
db.people.findOne()
db.people.find().pretty() // pretty-printing is the default in mongosh, so .pretty() can be omitted there

Update

personUp = {"Occupation" : "Actor"}
db.people.update({"Name" : "Sean Connery"}, {$set: personUp}) // .update() is deprecated now, use .updateOne() instead
db.people.findOne()

Delete

db.people.remove({"Name" : "Sean Connery"}) // .remove() is deprecated now, use .deleteOne() instead
db.people.remove({})
db.people.findOne()


Data Types

  • JSON-like documents
    • 6 data types: null, boolean, numeric, string, array, object
  • MongoDB adds support for other datatypes
    • null {"x" : null}
    • boolean {"x" : true}
    • number (by default 64-bit floating point numbers) {"x" : 1.4142}
      • 4-byte integers {"x" : NumberInt(141)}
      • 8-byte integers {"x" : NumberLong(141)}
    • string (any UTF-8 character) {"x" : "NobleProg"}
    • date (stored as milliseconds since the Unix epoch) {"x" : new Date()}
    • regular expressions (in queries) {"x" : /bob/i}
    • array {"x" : [1.4142, true, "training"]}
    • embedded documents {"x" : {"y" : 100}}
    • ObjectId {"x" : ObjectId("54597591bb107f6ef5989771")}
    • binary data (for non-UTF-8 strings)
    • code {"x" : function() {/*...*/}}
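
Several of these types follow directly from JavaScript itself. The plain-JavaScript sketch below (no MongoDB required) illustrates why numbers default to 64-bit floating point, why a separate 8-byte integer type is useful, and how dates map to milliseconds since the epoch. The variable names are chosen for illustration only.

```javascript
// JavaScript numbers are 64-bit floating point, which is why the shell
// stores plain numbers as doubles by default.
const isDouble = 0.1 + 0.2 !== 0.3;                  // binary fractions are rounded

// Integers above 2^53 cannot all be represented exactly as doubles,
// one reason a separate 8-byte integer type is useful.
const lossy = 9007199254740993 === 9007199254740992; // both literals become 2^53

// A date is a count of milliseconds since the Unix epoch (1970-01-01 UTC).
const epoch = new Date(0);
const birth = new Date("1930-08-25T00:00:00Z");      // before the epoch: negative count

// A regular expression literal like the one in the list above.
const matchesBob = /bob/i.test("Bob");

console.log(isDouble, lossy);
console.log(epoch.toISOString(), birth.getTime() < 0);
console.log(matchesBob);
```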


_id and ObjectId

  • every document in MongoDB must have _id
  • can be any type but it defaults to ObjectId
  • unique values in a single collection
  • ObjectId is designed to be lightweight and easy to generate
    • 12 bytes of storage (24 hexadecimal digits)
      • timestamp - bytes 0-3
      • machine - bytes 4-6
      • PID - bytes 7-8
      • increment - bytes 9-11
      • newer versions replace machine and PID with a 5-byte random value (bytes 4-8)
    • _id is generated automatically if not present in document
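
The byte layout above means the creation time can be recovered from the first 8 hexadecimal digits of an ObjectId. The sketch below does this in plain JavaScript; `decodeObjectId` is a hypothetical helper for illustration, not the shell's ObjectId class, and the ids are the ones from the sample document earlier in this material.

```javascript
// Hypothetical helper (not the shell's ObjectId class): splits the 24 hex digits
// of an ObjectId string into its leading timestamp and trailing counter.
function decodeObjectId(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error("expected 24 hexadecimal digits");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);   // bytes 0-3: seconds since the Unix epoch
  const counter = parseInt(hex.slice(18, 24), 16); // bytes 9-11: increment
  return { created: new Date(seconds * 1000), counter: counter };
}

// Ids taken from the sample document earlier in this material.
const person = decodeObjectId("545a414c7907b2a255b156c5");
const movie = decodeObjectId("545a5f167907b2a255b156c7");

console.log(person.created.toISOString());
console.log(movie.counter - person.counter);
```

Inside the shell the same timestamp is available without manual decoding, via ObjectId("...").getTimestamp().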


MongoDB Shell

$ mongo
MongoDB shell version: 4.0.6
connecting to: test
>

$ mongo HostName:PortNumber/DatabaseName
$ mongo localhost:27017/test

$ mongo --nodb
> conn = new Mongo("localhost:27017")
connection to localhost:27017
> db = conn.getDB("NobleProg")
NobleProg


Using help

  • mongo is a JavaScript shell, so help for the language itself is in the online JavaScript documentation
  • use built-in help for MongoDB-specific functionality
  • type a function name without parentheses to see its source code
> help
    db.help()            help on db methods
    db.mycoll.help()     help on collections methods
    ...
    exit                 quit mongo shell
>
> db.NobleProg.stats
function ( scale ){
    return this._db.runCommand( { collstats : this._shortName , scale : scale } );
}
>
> db.NobleProg.stats()
{ "ok" : 0, "errmsg" : "Collection [test.NobleProg] not found." }
>
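
The difference between naming a function and calling it is ordinary JavaScript behaviour, which the following sketch reproduces outside the shell. The `stats` function here is a stand-in written for illustration; it is not the shell's real implementation.

```javascript
// A stand-in function for illustration; it is not the shell's real stats().
function stats(scale) {
  return { collstats: "people", scale: scale };
}

// Without parentheses: the function object itself, shown as source text.
const source = String(stats);
// With parentheses: the function is called and its result is returned.
const result = stats(1024);

console.log(source);
console.log(result);
```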


Running Scripts

  • mongo can execute JavaScript files
  • scripts have access to all global variables (e.g. "db")
  • shell helpers (e.g. "show collections") do not work from files; use valid JavaScript equivalents (e.g. "db.getCollectionNames()")
  • use --quiet to hide "MongoDB shell version..." when executing script
  • use load() to run script directly from Mongo Shell
  • use .mongorc.js for frequently-loaded scripts
    • located in user home directory, run when starting up the shell, --norc to disable it
    • useful also to customize prompt
$ mongo script.js
MongoDB shell version: 4.0.6
connecting to: test
script.js was executed successfully!
$
$ mongo --quiet script.js
script.js was executed successfully!
$
$ mongo
MongoDB shell version: 4.0.6
connecting to: test
> load("script.js")
script.js was executed successfully!
>


Editing Complex Variables

  • limited multiline support in the shell
  • external editors are allowed
  • EDITOR="/usr/bin/gedit"
  • EDITOR="c:\\windows\\notepad.exe"
$ mongo --quiet
> EDITOR="c:\\windows\\notepad.exe"
> use training
switched to db training
> person = db.people.findOne()
{ "_id" : ObjectId("54568445cfc7c83518fa5430"), "Name" : "Sean Connery" }
> edit person
> person
{ "_id" : ObjectId("54568445cfc7c83518fa5430"), "Name" : "Sean Connery", "Nationality" : "Great Britain" }
> db.people.save(person) // .save() is deprecated now, use .replaceOne() instead