Problem statement & Requirements
I'm working with a complex C++ simulation that requires a large number of user-specified parameters. Both speed and readability are important. I'd like to define all possible parameters in one (and only one) place, and include sensible defaults that can be easily overridden. Finally, intelligent type-handling would be nice.
For convenience, I decided to wrap the C++ simulation in python setup/glue code. Python is a logical choice here as the "available everywhere" glue language that has nice standard libraries.
Available libraries
There aren't many data-passing options that work with both C++ and python. Libconfig, JSON, XML, and Google Protocol Buffers (PB) appear to be the only reasonable options. Here are my thoughts on the first three:
- Libconfig: Nice clean library, good language support. The big downside is that data structures must be defined both in a data file and in code - e.g. data is "moved" from a file into C++ variables. I feel like libconfig is best for a small number of complex variables, like lists and vectors.
- JSON: no clear standard C++ library, library docs so-so, speed complaints from some?
- XML: Massive overkill.
That leaves PB, which has nice docs for both C++ and python. All the variables, along with their types and defaults, are defined in a .proto file. The protoc tool auto-generates python and C++ code from the .proto file. By adding it to my Makefile, C++ classes are autogenerated at compile time. This makes for fast and readable C++ code - like using a named dict, but without the speed costs.
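To make the "one place for names, types and defaults" idea concrete, here is a rough sketch of what such a .proto file might look like. The field names are taken from the snippets further down; the types, field numbers and default values are my guesses for illustration, not the real file.

```proto
// ProtoBufInput.proto (illustrative sketch; types, numbers and defaults are assumed)
syntax = "proto2";
package ProtoBufInput;

message setupSim {
  optional string version        = 1 [default = "0.1"];
  optional bool   testCLI        = 2 [default = false];
  optional int32  number_of_days = 3 [default = 365];
  optional string file_options   = 4 [default = "options.csv"];
  optional bool   init           = 5 [default = false];
  optional bool   test_2         = 6 [default = false];
}
```

With proto2 optional fields, a reader gets the declared default when a field was never set, and HasField() reports whether a value was explicitly supplied.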
Solution / Workflow
I'm using python to read user-supplied values into a set of PB messages, and then serializing the messages to files. C++ then reads the messages from those files at runtime. A python script run by make synchronizes the locations of files between python and C++.
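One way such a synchronizing script could work is to keep the file locations in a single dict and render both a C++ header and a python module from it. This is only a sketch: the path "data/setupSim.pb" and the function names are hypothetical, and the output file names are assumed from the snippets below.

```python
# Hypothetical path table; the single source of truth for file locations.
PATHS = {"PbFile_setupSim": "data/setupSim.pb"}


def render_c_header(paths):
    """Return header text with one #define per file location."""
    lines = ["// Generated file, do not edit.", "#pragma once"]
    for name, path in sorted(paths.items()):
        lines.append('#define %s "%s"' % (name, path))
    return "\n".join(lines) + "\n"


def render_py_module(paths):
    """Return python module text with one string constant per file location."""
    lines = ["# Generated file, do not edit."]
    for name, path in sorted(paths.items()):
        lines.append("%s = %r" % (name, path))
    return "\n".join(lines) + "\n"


def write_outputs(paths, header_file="ProtoDataFiles.h",
                  module_file="ProtoDataFiles.py"):
    """Write both generated files so C++ and python agree on the paths."""
    with open(header_file, "w") as fh:
        fh.write(render_c_header(paths))
    with open(module_file, "w") as fh:
        fh.write(render_py_module(paths))
```

Calling write_outputs(PATHS) from a make rule before compiling would keep the #define seen by C++ and the constant seen by python in step.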
I also want to process commandline options for my python wrapper script. Happily, I can hand a PB message to optparse's parser.parse_args() (through its values argument) and have it set PB message attributes with setattr().
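The trick relies on optparse storing each parsed option with setattr() on whatever object is passed as values. The sketch below shows this with a stand-in class instead of a real generated message, so it runs without protobuf installed. FakeMessage and parse_into are hypothetical names; the class only mimics the two behaviours that matter here (rejecting unknown fields, and HasField).

```python
from optparse import OptionParser


class FakeMessage(object):
    """Hypothetical stand-in for a generated protobuf message."""
    FIELDS = ("testCLI", "number_of_days")

    def __init__(self):
        # bypass __setattr__ so the storage dict itself can be created
        self.__dict__["_values"] = {}

    def __setattr__(self, name, value):
        # like a real message, refuse names that are not declared fields
        if name not in self.FIELDS:
            raise AttributeError("unknown field: %s" % name)
        self._values[name] = value

    def HasField(self, name):
        return name in self._values


def parse_into(message, argv):
    """Parse argv, storing each option on message via setattr()."""
    parser = OptionParser()
    parser.add_option("-t", "--test", dest="testCLI", action="store_true")
    parser.add_option("-d", "--days", dest="number_of_days", type="int")
    message, args = parser.parse_args(argv, values=message)
    return message


msg = parse_into(FakeMessage(), ["--days", "30"])
```

Because a values object is supplied, optparse does not fill in its own defaults, so options missing from the commandline stay unset and HasField() can tell them apart later.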
The last python step (aside from writing the message to disk) is reading "variable,value" pairs from a .csv file. If a variable has already been set by parse_args, I skip it: the commandline values override .csv file values.
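The override rule can be shown on its own with plain dicts. This is a simplified sketch, not the real code: merge_csv is a hypothetical helper, and values stay as strings here, whereas in the wrapper the protobuf text format does the type conversion.

```python
import csv
import io


def merge_csv(csv_text, already_set):
    """Return merged settings; names in already_set win over the CSV."""
    merged = dict(already_set)
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    if header != ["variable", "value"]:
        raise ValueError("Incorrect header format")
    for row in reader:
        # skip blank lines and comment lines
        if not row or row[0].startswith("#"):
            continue
        if len(row) != 2:
            raise ValueError("Problem with value pair: %s" % row)
        name, value = row
        if name in already_set:
            continue  # commandline value takes precedence
        merged[name] = value
    return merged


csv_text = "variable,value\n# a comment\nnumber_of_days,10\ntest_2,true\n"
settings = merge_csv(csv_text, {"number_of_days": 30})
```

Here number_of_days keeps the commandline value 30, while test_2 is picked up from the file.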
Summary
Overall, PB makes a very nice data coupler between an interpreted language like python and a compiled language like C++. Python excels at text processing and is easy to prototype, while C++ is fast and beautiful.
PB has a few side-benefits. On the C++ side, it provides some natural namespace encapsulation to manage variable-explosion. Runtime inspection with gdb is easy enough.
Finally, storing all the option values used to run each simulation in a standard-format file is handy - it allows tests to re-run the simulation with exactly the same inputs.
Python Snippets
import csv
import subprocess
import sys
from optparse import OptionParser

from google.protobuf.text_format import Merge

import ProtoBufInput_pb2  # generated by protoc from the .proto file
import ProtoDataFiles     # file locations shared with C++

def main():
    ## initialize protobuf, fill with ParseArgs
    setupSim = ProtoBufInput_pb2.setupSim()
    setupSim = ParseArgs(sys.argv[1:], setupSim)
    prepInput(setupSim)
    RunSim()

def ParseArgs(argv, setupSim):
    parser = OptionParser(usage="wrapper.py [options]\nNote: commandline args over-ride values in files.", version=setupSim.version)
    ## these must be valid protocol buffer fields
    parser.add_option("-t", "--test", dest="testCLI",
                      action='store_true', help="Run test suite")
    parser.add_option("-d", "--days", metavar='N',
                      dest="number_of_days",
                      type='int', help="Number of days to simulate")
    ## parse!
    (setupSim, args) = parser.parse_args(argv, values=setupSim)
    return setupSim

def prepInput(setupSim):
    ## options from ParseArgs
    with open(setupSim.file_options, 'r') as inhandle:
        reader = csv.reader(inhandle, delimiter=',')
        header = next(reader)
        if header != ['variable', 'value']:
            raise Exception('Incorrect header format')
        for row in reader:
            ## skip blank lines and comments, check for 2 fields per row
            if not row or row[0].startswith('#'):
                continue
            if len(row) != 2:
                raise Exception('Problem with value pair: %s' % row)
            if setupSim.HasField(row[0]):
                print("Skipping config file, keeping commandline value: %s, %s" % (row[0], getattr(setupSim, row[0])))
                continue
            ## pack the message using text representation
            msgText = '%s : %s' % (row[0], row[1])
            setupSim = Merge(msgText, setupSim)
    ## write out to file for C++ to read
    with open(ProtoDataFiles.PbFile_setupSim, 'wb') as outhandle:
        outhandle.write(setupSim.SerializeToString())

def RunSim():
    subprocess.Popen("./sim").communicate()

if __name__ == "__main__":
    main()
C++ Code Snippets
//PbRead.h
#include <fstream>
#include <stdexcept>
#include "proto/ProtoBufInput.pb.h"

// read a serialized message of any generated PB type from a binary file
template <typename Type>
void PbRead(Type &msg, const char *filename){
    std::fstream infile(filename, std::ios::in | std::ios::binary);
    if (!infile) {
        throw std::runtime_error("Setup message file not found");
    } else if (!msg.ParseFromIstream(&infile)) {
        throw std::runtime_error("Parse error in message file");
    }
}
// sim.cpp
#include "PbRead.h"
#include "ProtoDataFiles.h"

// protocol buffers get passed around, are globals
ProtoBufInput::setupSim PbSetupSim;

int main(int argc, char **argv)
{
    GOOGLE_PROTOBUF_VERIFY_VERSION;
    // #define PbFile_setupSim "filename" in ProtoDataFiles.h, written by make
    PbRead(PbSetupSim, PbFile_setupSim);
    // set after reading: ParseFromIstream clears the message before parsing
    PbSetupSim.set_init(true);
    //...
    if (PbSetupSim.test_2()){
        //...
    }
}