Last update: 25 November 2006
export BUFR_TABLES=<installation-dir>/tables
bufrInfo [-3] [-4] filename
bufrDelSec2 [-z] inFilename outFilename
bufrTable sixDigitDescriptor
bufrFilterGeo [-w west] [-e east] [-s south] [-n north] inFilename outFilename
(See examples of their output)
As you might already know, BUFR is a "table-oriented" format. Practically, this means that to decode a file containing some bufr records, you have to look up their keys from external tables (the so-called tables A, B and D).
So, these programs have to be informed about the directory where the A, B and D tables reside. This is set by the environment variable BUFR_TABLES. If this is not set, the programs will look for them in directory <current working dir>/tables
So before running the programs, do something like this, assuming you have placed the
downloaded dir as /opt/meteo/bufr-1.1:
export BUFR_TABLES=/opt/meteo/bufr-1.1/tables
or better, put that command in your initialization file ~/.bashrc (csh users: use the csh syntax setenv BUFR_TABLES /opt/meteo/bufr-1.1/tables or add it to your ~/.login file)
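The lookup rule described above (use BUFR_TABLES if set, otherwise fall back to the tables subdirectory of the current working directory) can be sketched as follows. This is only an illustration; the function name tablesDir is hypothetical and is not taken from the package sources.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not from the package sources): returns the
// directory in which the A, B and D tables are expected.
std::string tablesDir()
{
    const char* env = std::getenv("BUFR_TABLES");
    if (env != nullptr)
        return env;        // explicit setting wins
    return "./tables";     // fallback: <current working dir>/tables
}
```

Because the fallback is relative to the current working directory, running the programs from another directory without BUFR_TABLES set would make them miss the tables, which is why exporting the variable in ~/.bashrc is the safer choice.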
This is the main program. It gives information about each bufr record in a bufr file, at one of three levels of detail:
The "terse info" mode needs no cmd-line params. The "descriptor-info" mode is requested with the -3 cmd-line option. The "data-info" mode needs the -4 cmd-line option. Finally, after the cmd-line options, you give the name of the file you want to examine. So, the full syntax is ( [ ] means optional):
bufrInfo [-3] [-4] filename
The -3 and -4 options can be combined to give the most detailed information (both full-descriptor-expansion of section-3 and data decoding of section-4), that is, you may give bufrInfo -34 filename
As you might know, there is an optional section in each bufr record, section-2. This optional section contains custom information, usually of no value to anyone other than the met-center that produced the record data.
The files I came across had section-2 accounting for about 23% of the total size, so I found it useful to strip it away (besides the storage gain, it makes later processing faster).
This is what the bufrDelSec2 utility does. It opens an input file, reads each of the bufr records it contains, strips their section-2 (if they have one) and writes them to a new output file. If the -z option is given on the cmd line, then the centre-id will additionally be zeroed out for bufr records that have a section-2. So, its syntax is:
bufrDelSec2 [-z] inFile outFile
Note that if you give the -z option, the centre-id won't be zeroed out for all bufr records; only for those that have a section-2 (which will be removed).
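To make the operation concrete, here is a rough sketch of stripping section-2 from a single record held in memory. It is not the bufrDelSec2 source. It assumes the BUFR edition 3 layout (8-byte section-0 with a 3-byte total length, the section-1 flag in its octet 8, the originating centre in its octet 6), and the function names are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a 3-byte big-endian unsigned integer starting at offset pos.
static std::size_t u24(const std::vector<std::uint8_t>& m, std::size_t pos)
{
    return (std::size_t(m[pos]) << 16) | (std::size_t(m[pos + 1]) << 8) | m[pos + 2];
}

// Sketch only: msg must hold exactly one complete edition-3 record.
// Returns true if a section-2 was found and removed.
bool stripSection2(std::vector<std::uint8_t>& msg, bool zeroCentre)
{
    const std::size_t sec1 = 8;                  // section-0 is 8 bytes long
    if (msg.size() < sec1 + 8 || msg[7] != 3)    // assumption: edition 3 only
        return false;
    if ((msg[sec1 + 7] & 0x80) == 0)             // flag bit clear: no section-2
        return false;
    const std::size_t sec2 = sec1 + u24(msg, sec1);
    if (sec2 + 3 > msg.size())
        return false;                            // truncated record
    const std::size_t len2 = u24(msg, sec2);
    if (len2 < 3 || sec2 + len2 > msg.size())
        return false;                            // inconsistent length
    msg.erase(msg.begin() + sec2, msg.begin() + sec2 + len2);
    msg[sec1 + 7] &= 0x7F;                       // clear "section-2 present"
    if (zeroCentre)
        msg[sec1 + 5] = 0;                       // -z: zero the centre-id
    const std::size_t total = msg.size();        // new total record length
    msg[4] = (total >> 16) & 0xFF;
    msg[5] = (total >> 8) & 0xFF;
    msg[6] = total & 0xFF;
    return true;
}
```

Note that the centre-id is touched only on the path where a section-2 was actually removed, which mirrors the behaviour of the -z option described above.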
Many times I had to look up how a table D sequence-descriptor in section-3 expands to its individual table B element-descriptors. In the end, I got the idea: why go through a file looking up numbers recursively, when the pc can do it? You give a 6-digit sequence descriptor as the only cmd-line arg to the program bufrTable and it prints back a human-readable full expansion of it. Its syntax is:
bufrTable sixDigitDescriptor
for example, bufrTable 301033 will expand the sequence descriptor 3.01.033
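The recursive lookup that bufrTable saves you from doing by hand can be sketched like this. It is a minimal illustration, not the program's actual code: the TableD type and the expand function are hypothetical names, and table D is assumed to be already loaded into a map and to contain no cycles.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory table D: sequence descriptor -> its members.
using TableD = std::map<std::string, std::vector<std::string>>;

// Appends the full expansion of desc to out. Descriptors starting with
// '3' are looked up in table D; anything else (and any sequence missing
// from the table) is emitted unchanged.
void expand(const TableD& tableD, const std::string& desc,
            std::vector<std::string>& out)
{
    TableD::const_iterator it = tableD.end();
    if (!desc.empty() && desc[0] == '3')   // F=3: sequence descriptor
        it = tableD.find(desc);
    if (it == tableD.end()) {              // element, or unknown sequence
        out.push_back(desc);
        return;
    }
    for (const std::string& child : it->second)
        expand(tableD, child, out);        // members may be sequences too
}
```

Calling expand with "301033" would walk into each member that is itself a sequence and leave only non-sequence descriptors in out, in order.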
A bufr-format file can contain multiple bufr records, with reported information from many geographical points. You may need only those that fall within a certain area (for example, because a program that uses them does not filter them itself, and the extra data cause a noticeable delay). The bufrFilterGeo program does geographical filtering. It accepts up to four geographical boundaries (west, east, south, north), an input file and an output file. It reads every bufr record from the input file, and puts in the output file only the ones whose data refer to a point within the area you asked for. Its syntax is:
bufrFilterGeo [-w west] [-e east] [-s south] [-n north] inFile outFile
Those boundaries that are not defined on the cmd line with one of the -e -w -s -n options are set internally to their extreme (i.e. south=-90, north=90, west=-180, east=180)
For example: bufrFilterGeo -s -5 -n 5.5 earth.bfr equator.bfr
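The defaulting of missing boundaries and the point test can be sketched as below. The GeoBox struct and the contains function are hypothetical names, and two details are assumptions rather than documented behaviour: the boundaries are treated as inclusive, and a box crossing the 180-degree meridian is not handled.

```cpp
// Hypothetical bounding box; every boundary starts at its extreme, so
// only the ones given on the cmd line need to be overwritten.
struct GeoBox {
    double west  = -180.0;
    double east  =  180.0;
    double south =  -90.0;
    double north =   90.0;
};

// True if the point lies inside the box (boundaries included).
bool contains(const GeoBox& box, double lat, double lon)
{
    return lat >= box.south && lat <= box.north &&
           lon >= box.west  && lon <= box.east;
}
```

For the example above, only south and north would be overwritten (with -5 and 5.5), so a point at any longitude passes as long as its latitude falls in that band.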
As you (should) know, you need 3 external tables to look up the keys of a bufr record: the tables A, B and D (table A is mostly "ornamental"; it is not really required). So, you would expect to find these three tables somewhere. Indeed, they are placed in the tables subdirectory (and located by the programs using the BUFR_TABLES environment variable). If you look there, you will find three files, aptly named tableA.txt, tableB.txt and tableD.txt (which you may modify, e.g. to change a human-text description).
But you will also see many other files, with cryptic names like B0000000000098000000.TXT or D0000000000098002001.TXT. What are they?
These are the tableB and tableD files freely distributed by ECMWF (European Centre for Medium-Range Weather Forecasts), along with their Fortran 77 bufr-handling API. They don't use a single B or D table but various (sub)versions of them. Since my met-service relies heavily on ECMWF products, I have adjusted my programs to read the expected table (sub)version when the bufr records are produced by ECMWF (that is, when their centre-identifier is 98). In fact, the "default" tableB.txt and tableD.txt are mere copies of the ECMWF-distributed ones.
Now, if you have to "mimic" ECMWF's behaviour and use "versioned" tables other than the default tableB.txt and tableD.txt, you have to add your extra tables to the tables directory (using the same format as the existing ones), modify the 12-line function "BufrSection1::getTablePath(const char kind) const" in Bufr.cpp and recompile (you need less-than-average c++ programmer skills to do the modification).
And a final note on the format of the textual tables. The layout of tables A and B is as self-describing as it can be. The table D layout, though, may not be that evident at first glance (but is at second). It has blocks like these:
301033 5 001005
002001
301011
301012
301021
It means that the sequence descriptor 3.01.033 expands to 5 descriptors, which are these five ones: 0.01.005, 0.02.001, 3.01.011, 3.01.012 and 3.01.021
If you think this is strange, I agree. But remember that these tables came from ECMWF and maybe this is the easiest format to read them in using Fortran 77. If you wonder why I haven't changed them into something more sane (like all descriptors on one line, meaning the first expands to the ones following it), this is the reason: as I have already said, we are heavy users of ECMWF products. I didn't want to have to remember to run a conversion utility whenever ECMWF updates its tables.
a. table C descriptors
If you run bufrInfo enough times, you will notice that it won't decode tableC descriptors other than 2.05.Y (character data) and 2.06.Y (custom data length). In fact, section-4 decoding will stop when the first such section-3 tableC descriptor is encountered.
This is done for two reasons:
First, the purpose of bufrInfo is to peek at the data themselves, not the extra information that table C usually defines.
Second, it is a matter of personal (and possibly misguided) taste: the tableC-descriptors are the ugliest thing I've ever seen; they break down the table-oriented nature of bufr, as any newly-defined one may modify the bit-stream in such a twisted way as to make all previous software read it wrong. In my opinion, if you accept table C, then every custom binary stream is table-oriented. You just have to declare one table C descriptor explaining how to read it! So, it should be clear that I won't add C-table support unless I'm really desperate to do so!
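The stopping rule can be pictured as follows: walk the expanded descriptor list and stop at the first table C descriptor (F=2) that is not of the 2.05.Y or 2.06.Y kind. This is only a sketch of the rule as stated above, with hypothetical function names; it is not how bufrInfo is actually written.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True for six-digit descriptors with F=2 (table C).
bool isTableC(const std::string& d)
{
    return d.size() == 6 && d[0] == '2';
}

// True for the two table C kinds the text says are decoded:
// 2.05.Y and 2.06.Y.
bool isHandledTableC(const std::string& d)
{
    return isTableC(d) &&
           (d.compare(1, 2, "05") == 0 || d.compare(1, 2, "06") == 0);
}

// Number of leading descriptors that would be decoded before stopping.
std::size_t decodableCount(const std::vector<std::string>& descriptors)
{
    std::size_t n = 0;
    while (n < descriptors.size() &&
           (!isTableC(descriptors[n]) || isHandledTableC(descriptors[n])))
        ++n;
    return n;
}
```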
b. compressed data
All programs will ignore bufr records with bufr-compression. This is not like table C; I have nothing against it, I just haven't come across compressed bufr records, so I didn't think implementing it was a high priority.
comments to: andreadis@hnms.gr