Storing C/C++ structures/objects

Berkeley DB can store any kind of data; that is, it is entirely 8-bit clean. How you take advantage of this depends, to some extent, on the application language you are using. In C and C++, there are a couple of different ways to store structures and objects.

First, you can use some form of length-prefixed encoding: copy the structure, field by field, into another piece of memory before storing it, and precede each variable-length field with its length:

struct {
	char *data1;
	u_int32_t data2;
	...
} info;
size_t len;
u_int8_t *p, data_buffer[1024];

/* The caller must ensure that the fields fit in data_buffer. */
p = &data_buffer[0];
len = strlen(info.data1);
memcpy(p, &len, sizeof(len));		/* length of the string */
p += sizeof(len);
memcpy(p, info.data1, len);		/* string bytes, without the NUL */
p += len;
memcpy(p, &info.data2, sizeof(info.data2));
p += sizeof(info.data2);
...

and so on, until all the fields of the structure have been loaded into the byte array. If you want more examples, see the Berkeley DB logging routines (for example, btree/btree_auto.c:__bam_split_log()). This technique is generally known as "marshalling". If you use this technique, you must then un-marshall the data when you read it back:

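struct {
	char *data1;
	u_int32_t data2;
	...
} info;
size_t len;
u_int8_t *p;

p = &data_buffer[0];
memcpy(&len, p, sizeof(len));
p += sizeof(len);
/* The stored string has no terminating NUL, so make room for one. */
if ((info.data1 = malloc(len + 1)) == NULL)
	return (ENOMEM);	/* or however the caller reports errors */
memcpy(info.data1, p, len);
info.data1[len] = '\0';
p += len;
memcpy(&info.data2, p, sizeof(info.data2));
p += sizeof(info.data2);
...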

and so on.
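The two fragments above can be combined into a pair of helper functions. The following is a minimal sketch under stated assumptions, not code from the Berkeley DB distribution. The names info_marshal, info_unmarshal, and marshal_roundtrip_demo are hypothetical. The standard uint32_t and uint8_t types stand in for u_int32_t and u_int8_t so that the example compiles without db.h, and a fixed-width uint32_t length prefix replaces size_t. The bytes are still written in host byte order, so the encoding is portable only between machines of the same endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type; the field names follow the text above. */
struct info {
	char *data1;		/* variable-length, NUL-terminated string */
	uint32_t data2;		/* fixed-size field */
};

/*
 * Marshal *ip into buf, which holds cap bytes. Returns the number of
 * bytes written, or 0 if the buffer is too small.
 */
static size_t
info_marshal(const struct info *ip, uint8_t *buf, size_t cap)
{
	uint32_t len = (uint32_t)strlen(ip->data1);
	size_t need = sizeof(len) + len + sizeof(ip->data2);
	uint8_t *p = buf;

	if (need > cap)
		return (0);
	memcpy(p, &len, sizeof(len));		/* length prefix */
	p += sizeof(len);
	memcpy(p, ip->data1, len);		/* string bytes, no NUL */
	p += len;
	memcpy(p, &ip->data2, sizeof(ip->data2));
	p += sizeof(ip->data2);
	return ((size_t)(p - buf));
}

/*
 * Un-marshal size bytes at buf into *ip. Returns 0 on success, or -1
 * if the buffer is too short or memory cannot be allocated. The caller
 * frees ip->data1.
 */
static int
info_unmarshal(struct info *ip, const uint8_t *buf, size_t size)
{
	uint32_t len;
	const uint8_t *p = buf;

	if (size < sizeof(len) + sizeof(ip->data2))
		return (-1);
	memcpy(&len, p, sizeof(len));
	p += sizeof(len);
	if (len > size - sizeof(len) - sizeof(ip->data2))
		return (-1);			/* length prefix is bogus */
	if ((ip->data1 = malloc((size_t)len + 1)) == NULL)
		return (-1);
	memcpy(ip->data1, p, len);
	ip->data1[len] = '\0';			/* restore the NUL */
	p += len;
	memcpy(&ip->data2, p, sizeof(ip->data2));
	return (0);
}

/* Round-trip a sample record through a byte buffer; 0 on success. */
int
marshal_roundtrip_demo(void)
{
	struct info in = { "hello", 42 }, out;
	uint8_t buf[64];
	size_t n;

	n = info_marshal(&in, buf, sizeof(buf));
	assert(n == sizeof(uint32_t) + 5 + sizeof(uint32_t));
	if (info_unmarshal(&out, buf, n) != 0)
		return (-1);
	assert(strcmp(out.data1, "hello") == 0 && out.data2 == 42);
	free(out.data1);
	return (0);
}
```

In an application, buf and the value returned by info_marshal would supply the data and size fields of the DBT that is handed to Berkeley DB.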

The second way to solve this problem works only if the structure has exactly one variable-length field, and that field is the last member of the structure. In that case, you can declare the structure as follows:

struct info {
	int a, b, c;
	u_int8_t buf[1];
};

Then, let's say you have a string you want to store in this structure. Because buf already provides one byte, which holds the string's terminating NUL, you allocate the structure as:

struct info *infop;

infop = malloc(sizeof(struct info) + strlen(string));

Since the allocated memory is contiguous, you can then initialize the structure as:

infop->a = 1;
infop->b = 2;
infop->c = 3;
memcpy(&infop->buf[0], string, strlen(string) + 1);

and give it to Berkeley DB to store, with a length of:

sizeof(struct info) + strlen(string);

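In this case, the structure can be copied out of the database and used without any additional work, provided it is read by a program built with the same structure layout and byte order as the program that stored it.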
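The single-allocation approach can be sketched end to end as follows. This is an illustration under stated assumptions rather than a definitive implementation. The helper names info_size, info_create, and flat_roundtrip_demo are hypothetical, uint8_t stands in for u_int8_t so that the example compiles without db.h, and a plain memcpy() into freshly allocated memory stands in for storing the record and reading it back. Writing past the declared end of buf[1] relies on the long-standing "struct hack", which common compilers accept; a C99 flexible array member (buf[]) is the standard-conforming alternative, at the cost of adding one to the size computation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct info {
	int a, b, c;
	uint8_t buf[1];		/* first byte of the trailing string */
};

/* Size of a record holding string; buf[1] already covers the NUL. */
static size_t
info_size(const char *string)
{
	return (sizeof(struct info) + strlen(string));
}

/* Allocate and fill a record; returns NULL if allocation fails. */
static struct info *
info_create(int a, int b, int c, const char *string)
{
	struct info *ip;

	if ((ip = malloc(info_size(string))) == NULL)
		return (NULL);
	ip->a = a;
	ip->b = b;
	ip->c = c;
	memcpy(&ip->buf[0], string, strlen(string) + 1);
	return (ip);
}

/* Copy a record's raw bytes and use the copy directly; 0 on success. */
int
flat_roundtrip_demo(void)
{
	const char *string = "hello";
	struct info *in, *out;
	size_t size = info_size(string);

	if ((in = info_create(1, 2, 3, string)) == NULL)
		return (-1);
	/* Stand-in for a store followed by a fetch: copy the bytes. */
	if ((out = malloc(size)) == NULL) {
		free(in);
		return (-1);
	}
	memcpy(out, in, size);
	assert(out->a == 1 && out->b == 2 && out->c == 3);
	assert(strcmp((const char *)out->buf, string) == 0);
	free(in);
	free(out);
	return (0);
}
```

The pointer returned by info_create and the value returned by info_size are what would go into the data and size fields of the DBT given to Berkeley DB.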