Basics of MongoDB

From Training Material
Jump to navigation Jump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.


Introduction

  • MongoDB is a fast document-oriented database
  • replaces the concept of a "row" with a more flexible model - the "document"
  • convenient data storage for modern object-oriented languages
  • no predefined schemas
  • no transactions
  • no SQL
  • supports indexes
  • easy to scale out horizontally


Getting Started

  • document is a basic unit of data (like row in RDBMS)
  • collection = table
  • multiple databases in single instance
  • _id in every document, unique within collection
  • JavaScript shell for administration and data manipulation


Documents

  • document is an ordered set of keys with associated values
  • representation of a document varies by programming language (map; hash - Perl, Ruby; dictionary - Python)
  • objects in JavaScript {key:value}
  • example {"company" : "NobleProg", "training" : "MongoDB for Developers"}
  • key is a string, any UTF-8 is allowed (except \0 . $)
  • type-sensitive {"age" : 3}, {"age" : "3"}
  • case-sensitive {"age" : 3}, {"Age" : 3}
  • documents cannot contain duplicated keys
  • key/value pairs are ordered {"x" : 1, "y" : 1} != {"y" : 1, "x" : 1}
    • order does not usually matter, MongoDB can reorder keys


{
  "_id" : ObjectId("545a414c7907b2a255b156c5"),
  "Name" : "Sean Connery",
  "Nationality" : "Great Britain",
  "BirthDate" : ISODate("1930-08-25T00:00:00Z"),
  "BirthYear" : 1930,
  "Occupation" : [
    "Actor",
    "Director",
    "Producer"
  ],
  "Movie" : [
    {
      "_id" : ObjectId("545a5f167907b2a255b156c7"),
      "Title" : "Dr. No"
    },
    {
      "_id" : ObjectId("545a5f317907b2a255b156c8"),
      "Title" : "From Russia with Love"
    },
    {
      "_id" : ObjectId("545a5ed67907b2a255b156c6"),
      "Title" : "Never Say Never Again"
    }
  ],
  "BirthPlace" : {
    "Country" : "United Kingdom, Scotland",
    "City" : "Edinburgh"
  }
}


Collections

  • a group of documents
  • dynamic schemas
    • {"company" : "NobleProg"}
    • {"age" : 5}
  • why should we use more than one collection?
    • nightmare for developers
    • much faster to get a list of collections than extracting document types from collections
    • grouping documents of the same kind
    • indexes
  • collection name is a string, any UTF-8 is allowed except:
    • empty string, start with "system." prefix, contain \0 character, $ character
  • subcollections separated by the . character
    • example GridFS (fs.files, fs.chunks)


Databases

  • database is a group of collections
  • one database = one application
  • separated databases for different applications, users
  • database name is a alphanumeric string, case sensitive, max 64 bytes, empty string is not allowed
  • database name will end up as file on filesystem (this explains restrictions)
  • special databases: admin (root database), local (never replicated), config (when sharding)
  • namespace is a concatenation of database and collection name (fully qualified collection name)
    • max 121 bytes, shold be less than 100


Getting and Starting MongoDB

  1. Installation on Windows
  2. Installation on Ubuntu


CRUD

Create

use NobleProg
person = {"Name" : "Sean Connery", "Nationality" : "Great Britain"}
db.people.insert(person) // .insert() is depricated now, use .insertOne() instead

Read

db.people.find()
db.people.findOne()
db.people.find().pretty() // .pretty() is the default now, so can be omitted

Update

personUp = {"Occupation" : "Actor"}
db.people.update({"Name" : "Sean Connery"}, {$set: personUp}) // .update() is depricated now, use .updateOne() instead
db.people.findOne()

Delete

db.people.remove({"Name" : "Sean Connery"}) // .remove() is depricated now, use .deleteOne() instead
db.people.remove({})
db.people.findOne()


Data Types

  • JSON-like documents
    • 6 data types: null, boolean, numeric, string, array, object
  • MongoDB adds support for other datatypes
    • null {"x" : null}
    • boolean {"x" : true}
    • number (by default 64-bit floating point numbers) {"x" : 1.4142}
      • 4-byte integers {"x" : NumberInt(141)}
      • 8-byte integers {"x" : LongInt(141)}
    • string (any UTF-8 character) {"x" : "NobleProg"}
    • date (stored as milliseconds from Linux epoch) {"x" : new Date()}
    • regular expressions (in queries) {"x" : /bob/i}
    • array {"x" : [1.4142, true, "training"]}
    • embedded documents {"x" : {"y" : 100}}
    • ObjectId {"x" : ObjectId("54597591bb107f6ef5989771")}
    • binary data (for non-UTF-8 strings)
    • code {"x" : function() {/*...*/}}


_id and ObjectId

  • every document in MongoDB must have _id
  • can be any type but it defaults to ObjectId
  • unique values in a single collection
  • ObjectId is designed to lightweight and easy to generate
    • 12-bytes of storage (24 hexidecimal digits)
      • timestamp - byte 0-3
      • machine - byte 4-6
      • PID - byte 7-8
      • increment - byte 9-11
    • _id is generated automatically if not present in document


MongoDB Shell

$ mongo
MongoDB shell version: 4.0.6
connecting to: test
>

$ mongo HostName:PortNumber/DatabaseName
$ mongo localhost:27017/test

$ mongo --nodb
> conn = new Mongo("localhost:27017")
connection to localhost27017
> db = conn.getDB("NobleProg")
NobleProg


Using help

  • mongo is a JavaScript shell, help is available in JavaScript on-line documentation
  • use built-in help for MongoDB-specific functionality
  • type function name without parentheses to see what the function is doing
> help
    db.help()            help on db methods
    db.mycoll.help()     help on collections methods
    ...
    exit                 quit mongo shell
>
> db.NobleProg.stats
function ( scale ){
    return this._db.runCommand( { collstats : this._shortName , scale : scale } );
}
>
> db.NobleProg.stats()
{ "ok" : 0, "errmsg" : "Collection [test.NobleProg] not found." }
>


Running Scripts

  • mongo can execute JavaScript files
  • scripts have access to all global variables (e.g. "db")
  • shell helpers (e.g. "show collections") do not work from files; use valid JavaScript equivalents (e.g. "db.getCollectionNames()")
  • use --quiet to hide "MongoDB shell version..." when executing script
  • use load() to run script directly from Mongo Shell
  • use .mongorc.js for frequently-loaded scripts
    • located in user home directory, run when starting up the shell, --norc to disable it
    • useful also to customize prompt
$ mongo script.js
MongoDB shell version: 4.0.6
connecting to: test
script.js was executed successfully!
$
$ mongo --quiet script.js
script.js was executed successfully!
$
$ mongo
MongoDB shell version: 4.0.6
connecting to: test
> load("script.js")
script.js was executed successfully!
>


Editing Complex Variables

  • limited multiline support in the shell
  • external editors are allowed
  • EDITOR="/usr/bin/gedit"
  • EDITOR="c:\\windows\\notepad.exe"
$ mongo --quiet
> EDITOR="c:\\windows\\notepad.exe"
> use training
switched to db training
> person = db.people.findOne()
{ "_id" : ObjectId("54568445cfc7c83518fa5430"), "Name" : "Sean Connery" }
> edit person
> person
{ "_id" : ObjectId("54568445cfc7c83518fa5430"), "Name" : "Sean Connery", "Nationality" : "Great Britain" }
> db.people.save(person)