Sorting a list of English words is simple enough because English relies on plain alphabetical ordering. Sorting German or French words, with all of their accents, or Chinese words, with their entirely different characters, is a lot harder. Sorting rules are specified through locales, which determine how accented characters are sorted, the order in which characters appear, and how case-insensitive sorting is performed.
In the last couple of blog posts, we learned how to specify collation rules for a collection or view in Navicat for MongoDB. Today we're going to see collation rules in action by sorting two collections that hold the same data but are defined with different collation rules.
The Test Data
For our test we'll use the following three words: 'boffey', 'bøhm', and 'brown'. Using the American English (en_US) locale, they will be sorted as:
- boffey
- bøhm
- brown
Meanwhile, sorting according to the nb (Norwegian Bokmål) locale swaps 'brown' and 'bøhm', because Norwegian treats ø as a separate letter that comes after z:
- boffey
- brown
- bøhm
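To make the difference concrete outside of MongoDB, here is a minimal Python sketch that imitates the two orderings for our three test words. It is only an approximation: the helper names `en_us_key` and `nb_key` are hypothetical, and the real rules come from the ICU collation data that MongoDB uses, which is far more thorough than this.

```python
import unicodedata

# Hypothetical stand-in for the Norwegian alphabet order (æ, ø, å follow z).
NORWEGIAN_ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
NB_RANK = {ch: i for i, ch in enumerate(NORWEGIAN_ALPHABET)}


def en_us_key(word):
    """Rough en_US-style key: ø is compared as a variant of o, accents are ignored."""
    folded = word.lower().replace("ø", "o")
    decomposed = unicodedata.normalize("NFD", folded)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original word is only used to break ties between equal base forms.
    return (base, word)


def nb_key(word):
    """Rough nb-style key: each letter is ranked by its place in the Norwegian alphabet."""
    # Characters outside the alphabet sort after every known letter.
    return [NB_RANK.get(ch, len(NB_RANK)) for ch in word.lower()]


words = ["bøhm", "boffey", "brown"]
print(sorted(words, key=en_us_key))  # ['boffey', 'bøhm', 'brown']
print(sorted(words, key=nb_key))     # ['boffey', 'brown', 'bøhm']
```

The only thing that changes between the two calls is the key function, which is essentially the role a locale plays in a collation.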
Creating the Collections
In Navicat, selecting your database in the Objects pane will display the Objects toolbar with the New Collection button enabled.
Clicking it will bring up a new Untitled Collection tab, along with its own toolbar. Click on the Collation tab to set the collation rules. For our purposes, all you need to do is select the "en_US" item from the Locale dropdown and hit the Save button. That'll bring up a dialog where you can provide a name for the collection. Call this one "sort_en_us".
Upon saving the collection, the remaining collation rules will change to their defaults.
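For reference, the defaults that get filled in can be written out as a collation document. The values below are the defaults as I understand them from the MongoDB documentation; the server also records a `version` field that depends on the ICU release, so it is left out here. The names `DEFAULT_COLLATION_OPTIONS` and `full_collation` are hypothetical and exist only for this sketch.

```python
# Assumed MongoDB defaults for every collation field other than the locale.
DEFAULT_COLLATION_OPTIONS = {
    "caseLevel": False,
    "caseFirst": "off",
    "strength": 3,
    "numericOrdering": False,
    "alternate": "non-ignorable",
    "maxVariable": "punct",
    "normalization": False,
    "backwards": False,
}


def full_collation(locale, **overrides):
    """Build a complete collation document from a locale plus optional overrides."""
    return {"locale": locale, **DEFAULT_COLLATION_OPTIONS, **overrides}


print(full_collation("en_US"))
print(full_collation("nb", strength=2))  # same defaults, but a weaker comparison strength
```

Choosing only the locale, as we did above, is equivalent to calling `full_collation("en_US")` with no overrides.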
Now we're ready to add the documents.
Double-click our new collection in the Objects pane to bring up the data. To enter a new document, click on the button with the plus sign in the bottom-left corner.
That will display the Add Document dialog. There, enter "name" as the field and "bøhm" as the value.
Clicking the Add button appends your new document to the collection.
Repeat that process to enter the "boffey" and "brown" values.
Next, create another collection named "sort_norwegian". This time, choose "nb" from the Locale dropdown. Be sure to enter the data in the same order so that both our datasets are identical.
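If you would rather script the setup than click through it, the same two collections could be built roughly as follows. This is a sketch, assuming the PyMongo driver and a `db` handle that you have already obtained; the helper name `build_test_collections`, the connection URI, and the database name in the trailing comment are all assumptions.

```python
WORDS = ["bøhm", "boffey", "brown"]  # same entry order as in the steps above

# Collection name -> locale, matching the two collections created in Navicat.
LOCALES = {"sort_en_us": "en_US", "sort_norwegian": "nb"}


def build_test_collections(db, words=WORDS):
    """Create both collections with a default collation and load the words.

    `db` is assumed to be a pymongo Database, or any object offering the
    same create_collection / insert_many interface.
    """
    created = {}
    for name, locale in LOCALES.items():
        # A collation given at creation time becomes the collection's default.
        coll = db.create_collection(name, collation={"locale": locale})
        # Fresh dicts per collection, so both datasets are identical but independent.
        coll.insert_many([{"name": word} for word in words])
        created[name] = coll
    return created


# Example use against a local server (assumed URI and database name):
#   from pymongo import MongoClient
#   build_test_collections(MongoClient("mongodb://localhost:27017")["test"])
```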
Sorting the Collections
With our two test collections in place, we're ready to sort them.
To do that, open the sort_en_us collection and click the Sort button on the toolbar. That will open a new pane above the data where you can define the sort criteria. To add a sort field, click on the plus sign button. The _id field will be set by default. To change it, click the field name and choose the name field from the list. Finally, apply the sort criteria by clicking the check mark button. Your data should now appear in the en_US order: boffey, bøhm, brown.
Do the same for the sort_norwegian collection and notice the different result: boffey, brown, bøhm.
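The same sort can be expressed as a query. In this sketch, which again assumes PyMongo and uses the hypothetical helper name `names_in_collection_order`, no collation is passed with the query, so MongoDB falls back on the default collation that each collection was created with.

```python
def names_in_collection_order(coll):
    """Return the "name" values sorted ascending by the collection's default collation.

    `coll` is assumed to be a pymongo Collection created with a default collation.
    """
    # Project only the name field; 1 means ascending order.
    cursor = coll.find({}, {"_id": 0, "name": 1}).sort("name", 1)
    return [doc["name"] for doc in cursor]


# Example use (assumes `db` is a pymongo Database holding both collections):
#   names_in_collection_order(db["sort_en_us"])
#   names_in_collection_order(db["sort_norwegian"])
```

Running the identical function against both collections should reproduce the two orderings shown earlier, since the only difference is the collation stored with each collection.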
And that, dear readers, is collation at work!