Manual of bufrInfo (and utilities)

Quick reference

Last update: 25 November 2006

export BUFR_TABLES=<installation-dir>/tables
bufrInfo [-3] [-4] filename
bufrDelSec2 [-z] inFilename outFilename
bufrTable sixDigitDescriptor
bufrFilterGeo [-w west] [-e east] [-s south] [-n north] inFilename outFilename

(See examples of their output)

Detailed manual

Last update: 25 November 2006

the external tables location
the bufrInfo program
the bufrDelSec2 utility program
the bufrTable utility program
the bufrFilterGeo filtering utility
about the A, B and D tables
Limitations (to-do list?)

1. the external tables location

As you might already know, BUFR is a "table-oriented" format. Practically, this means that to decode a file containing some bufr records, you have to look up their keys from external tables (the so-called tables A, B and D).

So, these programs have to be told which directory holds the A, B and D tables. This is set with the environment variable BUFR_TABLES. If it is not set, the programs look for the tables in the directory <current working dir>/tables.
So before running the programs, do something like this, assuming you have placed the downloaded directory at /opt/meteo/bufr-1.1:

export BUFR_TABLES=/opt/meteo/bufr-1.1/tables

or better, put that command in your initialization file ~/.bashrc
(csh users, use the csh-syntax setenv BUFR_TABLES /opt/meteo/bufr-1.1/tables or add it in your ~/.login file)
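
The lookup rule described above can be sketched in a few lines of Python. This is only an illustration, not the actual C++ code of the programs: the function name resolve_tables_dir is hypothetical, and treating an empty BUFR_TABLES the same as an unset one is an assumption of the sketch.

```python
import os
from pathlib import Path


def resolve_tables_dir(environ=None, cwd=None):
    """Return the directory that would be searched for tables A, B and D."""
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else Path(cwd)
    configured = environ.get("BUFR_TABLES")
    # Assumption of this sketch: an empty value counts as "not set".
    if configured:
        return Path(configured)
    # Fallback described in the manual: <current working dir>/tables
    return cwd / "tables"


if __name__ == "__main__":
    print(resolve_tables_dir())
```

Running it with and without BUFR_TABLES exported shows which directory would be searched in each case.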

2. the bufrInfo program

This is the main program. It prints information for each bufr record in a bufr file, at one of three levels of detail:

the "terse info": it will show the unexpanded sequence descriptors of section-3 (along with the generic info from section-1, like data-category, sizes, date etc)
the "descriptor info": It will fully expand the section-3 sequence descriptors, so you can see exactly what individual fields are contained in each bufr record
the "data info": it will expand the data of section-4, so you can see the actual values of each field of the bufr records

The "terse info" mode needs no cmd-line params. The "descriptor-info" mode is requested with the -3 cmd-line option. The "data-info" needs the -4 cmd-line option. Finally, after cmd-line options, you give the name of the file you want to examine. So, the full syntax is ( [ ] means optional):

bufrInfo [-3] [-4] filename

The -3 and -4 options can be combined to give the most detailed information (both full-descriptor-expansion of section-3 and data decoding of section-4), that is, you may give bufrInfo -34 filename
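
To make the option handling concrete, here is a minimal Python sketch of how a command line of this shape could be parsed, including the combined -34 form. It uses the standard getopt module; the function name parse_bufrinfo_args is hypothetical, and nothing here is taken from the real bufrInfo source.

```python
import getopt


def parse_bufrinfo_args(argv):
    """Return (expand_descriptors, decode_data, filename) for a bufrInfo-style command line."""
    # "34" declares two flag options without arguments, so "-34" is read as "-3 -4".
    options, positional = getopt.getopt(argv, "34")
    if len(positional) != 1:
        raise SystemExit("usage: bufrInfo [-3] [-4] filename")
    flags = {name for name, _ in options}
    return ("-3" in flags, "-4" in flags, positional[0])


if __name__ == "__main__":
    import sys
    print(parse_bufrinfo_args(sys.argv[1:]))
```

With no options both flags come back false, which corresponds to the "terse info" mode.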

3. the bufrDelSec2 utility program

As you might know, there is an optional section in each bufr record, the section-2. This optional section contains custom information, usually with no value for anyone other than the met-center that produced the record-data.

In the files I came across, section-2 accounted for about 23% of the total size, so I found it useful to strip it away (besides the storage gain, it makes later processing faster).

This is what the bufrDelSec2 utility does. It opens an input file, reads each of the bufr records it contains, strips section-2 from them (if they have one) and writes them to a new output file. If the -z option is given on the cmd line, the centre-id is additionally zeroed out for bufr records that have a section-2. So, its syntax is:

bufrDelSec2 [-z] inFile outFile

Note that if you give the -z option, the centre-id won't be zeroed out for all bufr records; only for those that have a section-2 (which will be removed).

4. the bufrTable utility program

Many times I had to look up how a table D sequence-descriptor in section-3 expands to its individual table B element-descriptors. In the end, I had an idea: why go through a file looking up numbers recursively, when the pc can do it? You give a 6-digit sequence descriptor as the only cmd-line arg to the program bufrTable and it prints back a human-readable full expansion of it. Its syntax is:

bufrTable sixDigitDescriptor

for example, bufrTable 301033 will expand the sequence descriptor 3.01.033
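
The relation between the six-digit form and the dotted form is simply a split into one, two and three digits (the F, X and Y parts of a BUFR descriptor). A minimal Python sketch, with hypothetical function names and no claim to match how bufrTable does it internally:

```python
import re


def split_descriptor(descriptor):
    """Split a six-digit descriptor string into its (F, X, Y) integer parts."""
    if not re.fullmatch(r"[0-9]{6}", descriptor):
        raise ValueError(f"expected six digits, got {descriptor!r}")
    # F is the first digit, X the next two, Y the last three.
    return int(descriptor[0]), int(descriptor[1:3]), int(descriptor[3:6])


def dotted(descriptor):
    """Format a six-digit descriptor in the F.XX.YYY notation used in this manual."""
    f, x, y = split_descriptor(descriptor)
    return f"{f}.{x:02d}.{y:03d}"


if __name__ == "__main__":
    print(dotted("301033"))
```

In this manual F=0 appears for table B element descriptors, F=3 for table D sequence descriptors and F=2 for table C descriptors.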

5. the bufrFilterGeo filtering utility

A bufr-format file can contain multiple bufr records, with reported information from many geographical points. You may need only those that fall within a certain area (for example, because a program that uses them does not filter them itself, and the extra data cause a noticeable delay). The bufrFilterGeo utility does geographical filtering. It accepts up to four geographical boundaries (west, east, south, north), an input file and an output file. It reads every bufr record from the input file, and puts in the output file only the ones whose data are of a point within the area you asked for. Its syntax is:

bufrFilterGeo [-w west] [-e east] [-s south] [-n north] inFile outFile

Boundaries that are not defined on the cmd line with one of the -e -w -s -n options are set internally to their extremes (i.e. south=-90, north=90, west=-180, east=180).
For example: bufrFilterGeo -s -5 -n 5.5 earth.bfr equator.bfr
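
The bounding-box test with its default extremes can be sketched in Python as follows. The class name GeoBox and its contains method are hypothetical. Two points are assumptions of the sketch and not statements about bufrFilterGeo itself: the boundaries are treated as inclusive, and a box that crosses the 180 degree meridian (west greater than east) is not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoBox:
    # Defaults are the extremes used when an option is not given.
    west: float = -180.0
    east: float = 180.0
    south: float = -90.0
    north: float = 90.0

    def contains(self, latitude, longitude):
        """True if the point lies inside the box (boundaries inclusive, by assumption)."""
        return (self.south <= latitude <= self.north
                and self.west <= longitude <= self.east)


if __name__ == "__main__":
    # Same limits as: bufrFilterGeo -s -5 -n 5.5 earth.bfr equator.bfr
    equator = GeoBox(south=-5, north=5.5)
    print(equator.contains(0.0, 23.7))
    print(equator.contains(37.9, 23.7))
```

Since only south and north are given, every longitude passes and the box is a band around the equator.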

6. about the A, B and D tables

As you (should) know, you need 3 external tables to look up the keys of a bufr record: the tables A, B and D (table A is mostly "ornamental"; it is not really required). So, you would expect to find these three tables somewhere. Indeed, they are placed in the tables subdirectory (and located by the programs using the BUFR_TABLES environment variable). If you look there, you will find three files, aptly named tableA.txt, tableB.txt and tableD.txt (which you may modify, e.g. to change a human-text description).

But you will also see many other files, with cryptic names like B0000000000098000000.TXT or D0000000000098002001.TXT. What are they?

These are the tableB and tableD files freely distributed by ECMWF (European Centre for Medium-Range Weather Forecasts), along with their Fortran 77 bufr-handling API. They don't use a single B or D table but various (sub)versions of them. Since my met-service relies heavily on ECMWF products, I have adjusted my programs to read the expected table (sub)version when the bufr records are produced by ECMWF (that is, when their centre-identifier is 98). In fact, the "default" tableB.txt and tableD.txt are mere copies of the ECMWF-distributed ones.

Now, if you have to "mimic" the ECMWF's behaviour and use "versioned" tables other than the defaults tableB.txt and tableD.txt, you have to add your extra tables to the tables directory (using the same format as the existing ones), modify the 12-line function "BufrSection1::getTablePath(const char kind) const" in Bufr.cpp and recompile (you need less-than-average c++ programmer skills to do the modification)

And a final note on the format of the textual tables. The layout of tables A and B is as self-describing as it can be. The table D layout, though, may not be that evident at first glance (but is at second). It has blocks like these:


 301033  5 001005
           002001
           301011
           301012
           301021

It means that the sequence descriptor 3.01.033 expands to 5 descriptors, which are these five ones: 0.01.005, 0.02.001, 3.01.011, 3.01.012 and 3.01.021
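
Reading blocks of this shape and expanding a sequence recursively can be sketched in Python as below. This is not the code used by the programs. The function names are hypothetical, and the sketch assumes the fields can be separated on whitespace (a real reader might rely on fixed columns instead). The 301011 block in the sample is added only so the recursion has something to do; check it against your own tableD.txt.

```python
SAMPLE_TABLE_D = """\
 301033  5 001005
           002001
           301011
           301012
           301021
 301011  3 004001
           004002
           004003
"""


def parse_table_d(text):
    """Parse table D blocks into {sequence: [member descriptors]}."""
    table, expected, current = {}, {}, None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if len(fields) == 3:
            # First line of a block: sequence, member count, first member.
            current, count, first = fields
            expected[current] = int(count)
            table[current] = [first]
        elif len(fields) == 1 and current is not None:
            # Continuation line: one more member of the current block.
            table[current].append(fields[0])
        else:
            raise ValueError(f"unexpected table D line: {line!r}")
    for sequence, members in table.items():
        if len(members) != expected[sequence]:
            raise ValueError(f"{sequence}: expected {expected[sequence]} members")
    return table


def expand(descriptor, table):
    """Recursively replace every sequence found in the table by its members."""
    if descriptor not in table:
        return [descriptor]  # not defined in this table, so left unexpanded
    result = []
    for member in table[descriptor]:
        result.extend(expand(member, table))
    return result


if __name__ == "__main__":
    print(expand("301033", parse_table_d(SAMPLE_TABLE_D)))
```

Descriptors that have no block in the parsed text (here 301012 and 301021) are left as they are, so with a complete table the result would contain only non-sequence descriptors.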

If you think this is strange, I agree. But remember that these tables came from ECMWF and maybe this is the easiest format to read using Fortran 77. If you wonder why I haven't changed them into something more sane (like all descriptors in one line, meaning the first expands to the ones following it), this is the reason: as I have already said, we are heavy users of ECMWF products. I didn't want to have to remember to run a conversion utility whenever ECMWF updates its tables.

7. Limitations (to-do list?)

a. table C descriptors
If you run bufrInfo enough times, you will notice that it won't decode tableC descriptors other than 2.05.Y (character data) and 2.06.Y (custom data length). In fact, section-4 decoding will stop when the first section-3 tableC descriptor is encountered. This is done for two reasons:

First, the purpose of bufrInfo is to peek at the data themselves, not the extra information that table C usually defines.

Second, it is a matter of personal (and possibly misguided) taste: the tableC-descriptors are the ugliest thing I've ever seen; they break the table-oriented nature of bufr, as any newly-defined one may modify the bit-stream in such a twisted way as to make all previous software read it wrong. In my opinion, if you accept table C, then every custom binary stream is table-oriented. You just have to declare one table C descriptor explaining how to read it! So, it should be clear that I won't add C-table support unless I'm really desperate to do so!

b. compressed data
All programs will ignore bufr records with bufr-compression. This is not like table C; I have nothing against it, I just haven't come across compressed bufr records, so I didn't consider implementing it a high priority.

comments to: andreadis@hnms.gr