Positron Developer's Guide: Working with the Database

The Neuros has several on-disk databases that are used by both the device during normal operation and the host computer during synchronization. The primary databases are:

audio: Audio files stored on the Neuros
pcaudio: Audio files stored on the host computer
unidedhisi: HiSi clips that have not been identified yet. These clips should be fingerprinted and looked up on the HiSi server during synchronization.
idedhisi: HiSi clips that have been identified. If the fingerprint is successfully located on the HiSi server, the database record corresponding to the clip should be removed from unidedhisi and put into this database along with the metadata returned from the server.
failedhisi: HiSi clips that could not be identified. If the lookup fails, the record for the HiSi clip should be moved from the unidedhisi database to this database.

Each database can be thought of as a collection of records with a fixed number of fields (at least one). The first field is the primary field. Next are zero or more fields called access keys, which are fields whose contents are indexed. All the records that contain a particular value in an access key field can be quickly looked up (for example, finding all the songs with the genre "Rock"). Finally, the access keys are followed by zero or more extra info fields. These fields contain data that does not need to be indexed, like filenames or file sizes. Some fields may also contain a collection of values, called a "bag." This is used in the audio database to allow one file to be in multiple playlists at once (that is, to have multiple values in its playlist field). Every database is required to have a special null record.

Design

The database structure described above is implemented as a tree of databases using a root database (audio, unidedhisi, etc.) with a child database (artist, genre, etc.) for each access key. When a record is added to the root, the actual contents of each access key field are replaced with a pointer to the record in the child database that contains the value in its primary field. Of course, if the value doesn't already exist in the child database, it must be added. Null values are possible for access keys; just use a pointer to the null record in the child database. The following diagram shows how this works:
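In code, the same relationship can be sketched with a small in-memory model. This is only an illustration of the pointer substitution described above, not the on-disk format: the class names are hypothetical, list indices stand in for file pointers, and slot 0 of each child is assumed to play the role of the null record.

```python
class ChildDatabase:
    """In-memory stand-in for a child database (e.g. artist or genre)."""

    def __init__(self):
        # Slot 0 plays the role of the null record (an assumption of this sketch).
        self.values = [None]
        self._index = {}

    def pointer_for(self, value):
        """Return the pointer for value, adding a record if it is new."""
        if not value:
            return 0  # empty access keys point at the null record
        if value not in self._index:
            self._index[value] = len(self.values)
            self.values.append(value)
        return self._index[value]


class RootDatabase:
    """In-memory stand-in for a root database with one child per access key."""

    def __init__(self, access_keys):
        self.children = {name: ChildDatabase() for name in access_keys}
        self.records = []

    def add(self, title, **keys):
        # Replace each access key value with a pointer into its child database.
        pointers = {name: child.pointer_for(keys.get(name))
                    for name, child in self.children.items()}
        self.records.append((title, pointers))
        return pointers


audio = RootDatabase(["artist", "genre"])
first = audio.add("Song A", artist="Band", genre="Rock")
second = audio.add("Song B", artist="Other", genre="Rock")
third = audio.add("Song C", artist="Band")  # no genre: null pointer
```

Both "Song A" and "Song B" end up holding the same pointer in their genre field, which is what makes a lookup by genre cheap.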

File Layout

Each root database is stored in a directory whose name is the same as the database name. Inside that directory are two files holding the contents of the root database: an MDB file and an SAI file. Child databases have an MDB file and an SAI file, as well as a PAI file used for reverse lookups. The name of each of these files is the name of the database with a ".mdb", ".sai", or ".pai" extension. All the files for the root and child databases are stored in the same directory. The file layout for the audio database is:

audio/albums.mdb - Album child database
audio/albums.pai
audio/albums.sai
audio/artist.mdb - Artist child database
audio/artist.pai
audio/artist.sai
audio/audio.mdb - Root database
audio/audio.sai
audio/genre.mdb - Genre child database
audio/genre.pai
audio/genre.sai
audio/playlist.mdb - Playlist child database
audio/playlist.pai
audio/playlist.sai
audio/recordings.mdb - Recordings child database
audio/recordings.pai
audio/recordings.sai
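
The naming rule can be written down as a short helper. This is a minimal sketch: the function name database_files is hypothetical, and the child names are simply the ones from the listing above.

```python
def database_files(root, children):
    """List the files that make up a root database and its child databases."""
    # The root database has only an MDB and an SAI file.
    files = [f"{root}/{root}.mdb", f"{root}/{root}.sai"]
    # Each child database adds a PAI file for reverse lookups.
    for child in children:
        for ext in ("mdb", "pai", "sai"):
            files.append(f"{root}/{child}.{ext}")
    return sorted(files)


audio_files = database_files(
    "audio", ["albums", "artist", "genre", "playlist", "recordings"])
```

For the audio database this yields the 17 paths shown in the listing, in the same order.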

Data Packing

Because of the nature of the DSP used in the Neuros, all database files are treated as a sequence of 16-bit words. This creates some packing issues that need to be considered when storing data in database files. This is handled in the following way for the 5 major types of data:

Bit-fields: Bit-fields are usually 16 bits and are stored in big-endian byte order.
Integers: Integers are 16 or 32 bits and also stored in big-endian byte order.
Pointers: Pointers are 32-bit integers that point to offsets in a file, rather than memory locations. However, unlike the pointers most programmers are used to, these pointers point at 16-bit words rather than bytes. So a pointer with value 0 refers to bytes 0 and 1 in a file, and a pointer with value 22 refers to bytes 44 and 45.
Null-terminated strings: These are similar to C-style strings. First, the string is null-padded at the end to make it end on a word boundary. Then it is terminated with a null word, 0x0000. Example: "foo" (0x66,0x6f,0x6f) would be coded as 0x66,0x6f,0x6f,0x00,0x00,0x00. Data of this type is referred to as sz in the tables.
Display data: This appears to be similar in use to a string, but with the ability to include some sort of binary data. For this reason, display data is made by again padding the string (or whatever data) out to a word boundary with nulls, but then prepending a word with the data length in words, excluding the length word itself. Example: "foo" would be coded as 0x00,0x02,0x66,0x6f,0x6f,0x00. Data of this type is referred to as dd in the tables.
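
The packing rules above can be sketched with Python's struct module. The function names are hypothetical, strings are taken as already-encoded bytes (the text encoding is not specified here), and the dd length word is assumed to be a big-endian 16-bit integer, as the "foo" example suggests.

```python
import struct


def pack_uint16(value):
    """16-bit integer or bit-field, big-endian."""
    return struct.pack(">H", value)


def pack_uint32(value):
    """32-bit integer or pointer, big-endian."""
    return struct.pack(">I", value)


def word_to_byte_offset(pointer):
    """Pointers count 16-bit words, so the byte offset is twice the value."""
    return pointer * 2


def _pad_to_word(data):
    # Add one null byte when the length is odd.
    return data + b"\x00" * (len(data) % 2)


def pack_sz(data):
    """Null-terminated string: pad to a word boundary, then add a null word."""
    return _pad_to_word(data) + b"\x00\x00"


def pack_dd(data):
    """Display data: length in words (excluding itself), then padded data."""
    padded = _pad_to_word(data)
    return struct.pack(">H", len(padded) // 2) + padded
```

With these definitions, pack_sz(b"foo") and pack_dd(b"foo") reproduce the two byte sequences given in the examples above, and word_to_byte_offset(22) gives 44.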

File Formats

Standard Database Field Definitions

In the following sections, the fields for each of the standard databases are defined. The following types are used to describe the extra info fields:

sz - Null-terminated string
uint32 - Unsigned 32-bit integer

The primary field is always a null-terminated string, and the access keys are 32-bit pointers to records in child databases which have only a primary field (also a null-terminated string).

audio

#	Name	Type	Description
0	Title	Primary	Title of track
1	Playlist	Access Key	Name of Playlist(s) containing this track
2	Artist	Access Key
3	Album	Access Key
4	Genre	Access Key
5	Recordings	Access Key	Set to "FM Radio" if the track was recorded from the radio and "Microphone" if it was recorded from the microphone.
6	Time	uint32	Length of track in seconds
7	Size	uint32	Size of track in kilobytes
8	Path	sz	Path to track on Neuros filesystem. Follows path conventions specified in the Overview.
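
A tool that reads or writes this database may want the table above as data. The following is one possible representation, not something defined by the Neuros itself: the Field tuple, the kind strings, and split_fields are all hypothetical names.

```python
from collections import namedtuple

Field = namedtuple("Field", "index name kind")

# Mirrors the audio table: one primary field, five access keys, three extras.
AUDIO_FIELDS = [
    Field(0, "Title", "primary"),
    Field(1, "Playlist", "access_key"),
    Field(2, "Artist", "access_key"),
    Field(3, "Album", "access_key"),
    Field(4, "Genre", "access_key"),
    Field(5, "Recordings", "access_key"),
    Field(6, "Time", "uint32"),
    Field(7, "Size", "uint32"),
    Field(8, "Path", "sz"),
]


def split_fields(fields):
    """Split a field list into (primary, access keys, extra info fields)."""
    primary = fields[0]
    access = [f for f in fields[1:] if f.kind == "access_key"]
    extra = [f for f in fields[1:] if f.kind != "access_key"]
    return primary, access, extra


primary, access_keys, extra_info = split_fields(AUDIO_FIELDS)
```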

pcaudio

#	Name	Type	Description
0	Title	Primary	Title of track
1	Playlist	Access Key	Name of Playlist(s) containing this track
2	Artist	Access Key
3	Album	Access Key
4	Genre	Access Key
5	Recordings	Access Key	Set to "FM Radio" if the track was recorded from the radio and "Microphone" if it was recorded from the microphone.
6	Time	uint32	Length of track in seconds
7	Size	uint32	Size of track in kilobytes
8	Path	sz	Path to track on host PC filesystem.

unidedhisi

#	Name	Type	Description
0	Title	Primary	Usually the name of the file
1	Source	sz	Source of track (Ex: "FM 100.7")
2	Path	sz	Path to track on Neuros filesystem. Follows path conventions specified in the Overview.

idedhisi

#	Name	Type	Description
0	Title	Primary	Title of HiSi clip (not title of actual track)
1	Source	sz	Source of clip (see unidedhisi)
2	Artist	sz
3	Album	sz
4	Genre	sz
5	Track Name	sz	Title of actual track. (Found during song identification)
6	Time	uint32	Length of track in seconds
7	Size	uint32	Size of track in kilobytes
8	Path	sz	Path to track on Neuros filesystem. Follows path conventions specified in the Overview.

failedhisi

Field definitions are the same as unidedhisi.

#	Name	Type
0	Title	Primary
1	Source	sz
2	Path	sz

Maintaining Database Consistency

Reading the database is fairly straightforward; the tricky part is maintaining consistency between all of the parts of the database when modifying it. Failure to do this correctly often leads to unpredictable behavior in the Neuros, and sometimes even causes it to freeze. The following sections explain these consistency conditions for various common operations on the database. They assume you are familiar with the format of the various database files.

Adding a Record

To add a new record to a database:

1. Locate pointers to all of the access keys by searching the appropriate child database for each one. Note that empty access keys should point at the null record in the associated child database.
2. Any access keys that cannot be found need to be added to the child databases (by recursively following these steps), and these new pointers gathered.
3. Using these pointers to the access keys, a new record needs to be added to the MDB file.
4. If this is a child database, a new module in the PAI file should be created.
5. A new SAI record should be created with pointers to the MDB record created in step 3, and the PAI module in step 4. (Remember the caveat about pointers to PAI modules.)
6. An entry to the MDB record created in step 3 needs to be added to the PAI module associated with each access key in the child databases. If there is no more room for an additional entry in a PAI module, it may be necessary to extend it.
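
The order of these steps can be sketched with an in-memory model. This is a simplification under several assumptions, not the file format: the Database class is hypothetical, list indices stand in for file pointers, pointer 0 is the null record, bags are not modeled, and the position of a new entry within the SAI file is not modeled either.

```python
class Database:
    """Simplified in-memory model; list indices stand in for file pointers."""

    def __init__(self, name, children=(), is_child=False):
        self.name = name
        self.is_child = is_child
        self.children = list(children)  # one child Database per access key
        # Pointer 0 is the null record; its own access keys are null too.
        self.mdb = [(None, [0] * len(self.children))]
        self.pai = [[]] if is_child else []  # module i belongs to MDB record i
        self.sai = []  # entries: (mdb pointer, pai pointer or None)

    def find(self, primary):
        for pointer, (value, _keys) in enumerate(self.mdb):
            if pointer != 0 and value == primary:
                return pointer
        return None

    def add_record(self, primary, keys=()):
        keys = list(keys) + [None] * (len(self.children) - len(keys))
        # Locate each access key in its child; add it there if it is missing.
        key_pointers = []
        for child, value in zip(self.children, keys):
            if not value:
                pointer = 0  # empty access key: null record
            else:
                pointer = child.find(value)
                if pointer is None:
                    pointer = child.add_record(value)  # recursive add
            key_pointers.append(pointer)
        # Append the new MDB record.
        self.mdb.append((primary, key_pointers))
        mdb_pointer = len(self.mdb) - 1
        # Child databases get a fresh, empty PAI module.
        pai_pointer = None
        if self.is_child:
            self.pai.append([])
            pai_pointer = len(self.pai) - 1
        # The SAI entry ties the MDB record to its PAI module.
        self.sai.append((mdb_pointer, pai_pointer))
        # Record the backlink in each access key's PAI module. Whether the
        # null record also collects backlinks is an assumption of this sketch.
        for child, pointer in zip(self.children, key_pointers):
            child.pai[pointer].append(mdb_pointer)
        return mdb_pointer


artist = Database("artist", is_child=True)
genre = Database("genre", is_child=True)
audio = Database("audio", children=[artist, genre])
a = audio.add_record("Song A", ["Band", "Rock"])
b = audio.add_record("Song B", ["Band", None])
```

After the two calls, the PAI module for "Band" lists both songs, so the reverse lookup from an artist to its tracks needs no scan of the root database.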

Extending a PAI Module

Extending a PAI module makes it larger so that it can hold more pointers to parent records:

1. Move all of the modules after the module being extended down by the size of the extension. Remember that the size of a PAI module must be a multiple of the minimum module length.
2. Fill in the extra space with zeros.
3. All of the PAI pointers in the associated SAI file will be wrong for modules that are located after this one in the PAI file. Update them to reflect the new locations.
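
As a rough sketch of this procedure, the PAI file can be modeled as a list of 16-bit words and the SAI file as a list of word pointers into it. The function name, the argument layout, and the example minimum module length of 4 words are assumptions made for illustration; any bookkeeping stored inside the module itself is not modeled.

```python
def extend_pai_module(pai_words, sai_pai_pointers, start, length, min_length):
    """Grow the module at [start, start + length) by one min_length block.

    pai_words is the PAI file as a list of 16-bit words and sai_pai_pointers
    holds the PAI pointers taken from the SAI file. Both are updated in place.
    Returns the new module length in words.
    """
    end = start + length
    # Shift every later module down and zero-fill the gap in one step.
    pai_words[end:end] = [0] * min_length
    # Any pointer at or past the old end belongs to a later module and moved.
    for i, pointer in enumerate(sai_pai_pointers):
        if pointer >= end:
            sai_pai_pointers[i] = pointer + min_length
    return length + min_length


# Three modules of 4 words each, at word offsets 0, 4 and 8.
pai = [1, 1, 0, 0, 2, 2, 0, 0, 3, 0, 0, 0]
sai_pointers = [0, 4, 8]
new_length = extend_pai_module(pai, sai_pointers, start=4, length=4, min_length=4)
```

Comparing against the old end of the module, rather than its start, keeps the update correct even if an SAI pointer refers to a word inside its module instead of the module's first word.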

Deleting a Record

There are two ways to delete a record. It can be "marked" as deleted, and it will then be ignored by the Neuros (this is what the firmware does when the user requests that a record be deleted):

1. Set the "isDeleted" flag on the MDB record.
2. Remove the entire SAI record associated with this record. This must be done by moving the rest of the file up by the size of one record.
3. If this is a child database, clear the associated PAI module and mark it as empty. The module does not actually have to be removed.
4. If this database has access keys, then the backlinking pointers in each of the associated PAI modules in the child databases need to be removed. Removing a pointer from a PAI module requires it to be erased by sliding all of the other entries in that module up. (Not the whole file! The module itself should not change size. Put zeros in the extra spot in the module this creates.)
5. If an access key in a child database no longer has any entries in its PAI module, that means it is no longer used by any records in the parent database and should be deleted. Follow these same steps again for it.
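
Removing a backlink from a PAI module can be sketched as follows. The module is modeled as a fixed-size list of parent pointers in which 0 marks an unused slot; that representation and the name remove_backlink are assumptions for illustration, since the real module layout is defined by the file format.

```python
def remove_backlink(module, parent_pointer):
    """Erase one parent pointer from a PAI module without resizing it.

    Returns True when the module has no entries left, meaning the access
    key is unused and the child record itself should be deleted next.
    """
    index = module.index(parent_pointer)  # raises ValueError if absent
    del module[index]   # later entries slide up by one slot
    module.append(0)    # zero the freed slot at the end; size is unchanged
    return not any(module)


module = [10, 20, 30, 0]
unused_after_first = remove_backlink(module, 20)
state_after_first = list(module)
remove_backlink(module, 10)
unused_after_last = remove_backlink(module, 30)
```

The return value is what drives the last step above: once a module empties, the same deletion procedure is applied to the child record that owns it.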

Complete deletion of a record is much more difficult. It is identical to the above process, but also requires the MDB records and PAI modules to be removed from their respective files. This will invalidate many pointers in the whole database, so it is perhaps easier to rebuild the database from scratch than to attempt to update the contents of the database in place.