Querying

Distributing queries to slaves

Note: 1.1.0+

If you are using a replica set and version 1.1.0 or above of the driver, the driver can automatically route reads to slaves. This behavior does not exist in earlier versions of the driver and cannot be used with "normal" master-slave.

By default, the driver will send all queries to the master. If you set the "slaveOkay" option, the driver will send all queries to a non-primary server, if possible. The "slaveOkay" option can be set at every "level": connection, database, collection, and cursor. Each class inherits the "slaveOkay" setting from the class above it, so if you do:

<?php

$db->setSlaveOkay(true);
$c = $db->myCollection;

$cursor = $c->find();

?>

then the query will be executed against a slave (the collection inherited "slaveOkay" from the database and the cursor inherited it from the collection).
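
Because the setting is inherited rather than fixed, a lower level can override what it received from the level above. The following is a minimal sketch, assuming $db is a MongoDB object obtained from a replica set connection and myCollection is only an illustrative collection name:

<?php

$db->setSlaveOkay(true);
$c = $db->myCollection;

// the collection inherited "slaveOkay" from the database,
// but this one cursor turns it off, so the query goes to the primary
$cursor = $c->find()->slaveOkay(false);

?>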

How slaves are chosen

Each instance of Mongo chooses its own slave using the available slave with the lowest ping time. So, if we had a PHP client in Europe and one in Australia and we had one secondary in each of these data centers, we could do:

<?php

// P is the primary

// on the Australian client
$m1 = new Mongo("mongodb://P", array("replicaSet" => true));
$m1->foo->bar->find()->slaveOkay()->getNext();
echo "m1's slave is ".$m1->getSlave()."\n";

// on the European client
$m2 = new Mongo("mongodb://P", array("replicaSet" => true));
$m2->foo->bar->find()->slaveOkay()->getNext();
echo "m2's slave is ".$m2->getSlave()."\n";

?>

we'd probably end up with something like:


m1's slave is australianHost
m2's slave is europeanHost

Note that we have to do a query before a slave is chosen: slaves are chosen lazily by the driver. Mongo::getSlave() will return NULL until a slave is used.

You can see what the driver thinks is the current status of the set members by running Mongo::getHosts().

If no non-primary server is readable, the driver will send reads to the primary (even if "slaveOkay" is set). A server is considered readable if its state is 2 (SECONDARY) and its health is 1. You can check this with Mongo::getHosts().
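
The readability check described above can be sketched as a loop over the result of Mongo::getHosts(). This is only an illustration: it assumes $m is a Mongo instance connected to a replica set, and that each entry returned by getHosts() is an array with "state" and "health" keys, which may differ between driver versions.

<?php

// $m is assumed to be a Mongo instance connected to a replica set
foreach ($m->getHosts() as $name => $info) {
    // assumed keys: "state" (2 means SECONDARY) and "health" (1 means healthy)
    $readable = ($info["state"] == 2 && $info["health"] == 1);
    echo $name.": ".($readable ? "readable slave" : "not a readable slave")."\n";
}

?>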

If you enjoy twiddling knobs that you probably shouldn't mess with, you can ask the driver to use a different slave by calling Mongo::switchSlave(). This may choose a new slave (if one is available) and shouldn't be used unless you know what you're doing.

Random notes

Writes are always sent to the primary. Database commands, even read-only commands, are also always sent to the primary.

The health and state of each slave are checked every 5 seconds, or on the next operation once 5 seconds have passed. The driver also rechecks the configuration when it has a problem reaching a server.

Note that a non-primary server may be behind the primary in operations, so your application must be okay with getting out-of-date data (or you must use the "w" option for all writes).

Querying by _id

Every object inserted is automatically assigned a unique _id field, which is often a useful field to use in queries.

Suppose that we wish to find the document we just inserted. Inserting adds an _id field to the document, so we can query by that:

<?php

$person = array("name" => "joe");

$people->insert($person);

// now $person has an _id field
$joe = $people->findOne(array("_id" => $person['_id']));

?>

Unless the user has specified otherwise, the _id field is a MongoId. The most common mistake is attempting to use a string to match a MongoId. Keep in mind that these are two different datatypes and will not match each other, in the same way that the string "array()" is not the same as an empty array. For example:

<?php

$person = array("name" => "joe");

$people->insert($person);

// convert the _id to a string
$pid = $person['_id'] . "";

// FAILS - $pid is a string, not a MongoId
$joe = $people->findOne(array("_id" => $pid));

?>
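
If only the string form of the id is available, it can be wrapped in a MongoId again before querying, so that both sides of the comparison have the same type. A minimal sketch, assuming $people is the same MongoCollection as in the examples above:

<?php

$person = array("name" => "joe");

$people->insert($person);

// convert the _id to a string
$pid = $person['_id'] . "";

// convert the string back to a MongoId so the types match
$joe = $people->findOne(array("_id" => new MongoId($pid)));

?>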

Arrays

Arrays are special in a couple of ways. First, there are two types that MongoDB uses: "normal" arrays and associative arrays. Associative arrays can have any mix of key types and values. "Normal" arrays are defined as arrays with ascending numeric indexes starting at 0 and increasing by one for each element. These are, typically, just your usual PHP array.

For instance, if you want to save a list of awards in a document, you could say:

<?php

$collection->save(array("awards" => array("gold", "silver", "bronze")));

?>

Queries can reach into arrays to search for elements. Suppose that we wish to find all documents whose array contains a given value, for example documents with a "gold" award, such as:

{ "_id" : ObjectId("4b06c282edb87a281e09dad9"), "awards" : ["gold", "silver", "bronze"]}

This can be done with a simple query, ignoring the fact that "awards" is an array:

<?php

$cursor = $collection->find(array("awards" => "gold"));

?>

Suppose we are querying for a more complex object, where each element of the array is itself an object, such as:

{
     "_id" : ObjectId("4b06c282edb87a281e09dad9"),
     "awards" :
     [
        {
            "first place" : "gold"
        },
        {
            "second place" : "silver"
        },
        {
            "third place" :  "bronze"
        }
     ]
}

Still ignoring that this is an array, we can use dot notation to query the subobject:

<?php

$cursor = $collection->find(array("awards.first place" => "gold"));

?>

Notice that it doesn't matter that there is a space in the field name (although it may be best not to use spaces, just to make things more readable).

You can also use an array to query for a number of possible values. For instance, if we were looking for documents with a "gold" or "copper" award, we could do:

<?php

$cursor = $collection->find(array("awards" => array('$in' => array("gold", "copper"))));

?>