This directory contains the open source PET system. For license details
please refer to the LICENSE file.

Overview
========

PET consists of two major parts:

1) The `flop' preprocessor. It compiles a given grammar in TDL
   format into a binary form usable by the runtime system.
2) The `cheap' parser. It parses input with respect to a grammar in the
   binary form produced by `flop'.

No binaries are provided because of the large number of different
distributions in use. You must compile the package before you can run it
(see `Compiling PET' below).

Grammars
========

Two freely available open source grammars that are compatible with PET
are the ERG (English, available from http://lingo.stanford.edu/ftp/erg.tgz,
or http://lingo.stanford.edu) and JACY (Japanese,
http://www.dfki.de/~siegel/grammar-download/JACY-grammar.html).

If you want to contribute a grammar, or if you are looking for more
grammars, please have a look at www.delph-in.net.

Compiling PET
=============

In order to compile the source code, first make sure that all required
libraries and any optional libraries you want to use, as well as the
corresponding header files, are installed on your system (see the section
`Dependencies' below).

PET uses the GNU Build System (aka GNU Autotools). GCC/g++ versions later
than 3.1.2 are known to work.

If you have extracted a tarball, change into the directory containing the
`configure' script and type:

    ./configure --help

Have a look at the available options, and configure your system as needed.
Once configure has completed successfully, run `make' to build the
software, and finally `make install' to install it (depending on the
install paths, which can be set with `configure', this might require
administrator privileges).
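
As a minimal sketch of the steps above, the following shell functions
install into a per-user prefix so that `make install' does not need
administrator privileges. `--prefix' is a standard Autoconf option; the
PREFIX default, the DRY_RUN switch and the function names are made up for
this example and are not part of PET.

```shell
#!/bin/sh
# Sketch: configure, build and install into a per-user prefix.
# PREFIX and DRY_RUN are names chosen for this example only.
PREFIX="${PREFIX:-$HOME/pet}"

# Print a command, then execute it unless DRY_RUN=1 is set.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Run the three build steps in order, stopping at the first failure.
# Call this from the directory that contains the `configure' script.
build_pet() {
    run ./configure --prefix="$PREFIX" &&
    run make &&
    run make install
}
```

With DRY_RUN=1 set, build_pet only prints the three commands, which is a
cheap way to check them before starting a real build.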

Advanced issues:

* Non-standard preprocessor, compiler or linker options can be passed to
  configure as command line arguments (cf. ./configure --help).
  For example, the following line configures PET for best performance:

      ./configure --disable-assert "CXXFLAGS=-g -O3"

  The following line configures PET for use with a debugger:

      ./configure "CXXFLAGS=-g -O0"

  The following line configures PET for profiling with gprof and gcov
  (cf. the gcc man page):

      ./configure "CXXFLAGS=-pg -fprofile-arcs -ftest-coverage"

* The root build directory is the directory where `configure' is called.
  That is, PET can be built in any directory by changing into that
  directory and calling `configure' from there. For example:

      mkdir -p build/debug
      cd build/debug
      ../../configure "CXXFLAGS=-g -O0"

  All Makefiles will be created in the directory where `configure' is
  called. The advantage of having different build directories is that
  several configurations can be tested and maintained at the same time,
  and that the binary files are kept separate from the source files.

* A tarball distribution can be built and checked with `make distcheck'.
  Use the DISTCHECK_CONFIGURE_FLAGS environment variable to pass any
  parameters to configure. Example:

      export DISTCHECK_CONFIGURE_FLAGS="--with-xml"
      make distcheck
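
The separate build directories described in the list above can be
scripted. The sketch below assumes that it is run from the directory
containing `configure' and that the build/<name> layout from the example
is used; the function name setup_build_dir is invented here and is not
part of PET.

```shell
#!/bin/sh
# setup_build_dir NAME [CONFIGURE-ARGS...]
# Create build/NAME below the source directory and run configure there,
# so that each configuration gets its own Makefiles and object files.
# (setup_build_dir is a hypothetical helper, not part of PET.)
setup_build_dir() {
    name=$1
    shift
    mkdir -p "build/$name" || return 1
    # Subshell: the caller's working directory is left unchanged.
    ( cd "build/$name" && ../../configure "$@" )
}

# Example calls, using the configure arguments from the list above:
#   setup_build_dir debug "CXXFLAGS=-g -O0"
#   setup_build_dir opt   --disable-assert "CXXFLAGS=-g -O3"
#   (cd build/debug && make)
```

Quoting "$@" keeps an argument such as "CXXFLAGS=-g -O3" together as a
single word when it is handed on to configure.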

Bootstrapping PET from SVN
==========================

If you have checked out an SVN snapshot, you first have to run recent
versions of autoconf and autoheader (both >= 2.59) as well as aclocal and
automake (both >= 1.7) in order to create the build system. Change into
the directory containing `configure.ac' and type:

    autoheader && aclocal -I m4 && automake -a && autoconf

or simply

    autoreconf -i

Having run these commands, proceed with configuration as described above.
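
Before bootstrapping, it can help to confirm that the installed tools
meet the minimum versions given above. The following is only a sketch:
the helper names are invented for this example, it relies on `sort -V'
(available in GNU coreutils, an assumption about the host system), and it
assumes that each tool prints its version as the last word of the first
line of `--version' output.

```shell
#!/bin/sh
# version_ge A B: succeed if version A is greater than or equal to B.
# Uses `sort -V' so that 1.16 is ordered after 1.7.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# tool_version TOOL: last word of the first line of `TOOL --version'.
tool_version() {
    "$1" --version 2>/dev/null | head -n 1 | awk '{ print $NF }'
}

# check_tool TOOL MINIMUM: report whether TOOL meets the minimum version.
check_tool() {
    found=$(tool_version "$1")
    if [ -n "$found" ] && version_ge "$found" "$2"; then
        echo "ok: $1 $found (>= $2)"
    else
        echo "missing or too old: $1 (need >= $2)"
        return 1
    fi
}

# Example, using the minimum versions from the text above:
#   check_tool autoconf 2.59 && check_tool autoheader 2.59 &&
#   check_tool aclocal 1.7 && check_tool automake 1.7 &&
#   autoreconf -i
```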

Dependencies
============

Most current GNU/Linux distributions provide packages for the required and
optional libraries. Installing such packages usually requires administrator
privileges. Note that the header files are often distributed in separate
packages, usually ending in `-dev' or `-devel' (e.g. `libboost-dev'). Make
sure that these packages are installed, too.
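
A quick way to see whether the header packages are present is to look for
one well-known header per library. This is a rough sketch only: the helper
name have_header is invented here, the headers listed are representative
files shipped by Boost Graph, ICU, Xerces-C++ and cppunit (not necessarily
the ones PET includes), and some systems keep headers in directories other
than the two defaults searched below.

```shell
#!/bin/sh
# have_header HEADER [DIR...]: succeed if HEADER exists under one of the
# given include directories (default: /usr/include, /usr/local/include).
have_header() {
    header=$1
    shift
    [ $# -gt 0 ] || set -- /usr/include /usr/local/include
    for dir in "$@"; do
        [ -f "$dir/$header" ] && return 0
    done
    return 1
}

# Report every representative header that cannot be found.
for h in boost/graph/adjacency_list.hpp unicode/utypes.h \
         xercesc/util/XercesDefs.hpp cppunit/TestCase.h; do
    have_header "$h" || echo "header not found: $h"
done
```

A missing header for an optional library only matters if the
corresponding feature is wanted.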

If you install your own libraries in non-standard directories, make sure
that all these shared libraries and their dependencies can be found when
PET's configure is run. There are several ways to achieve this, e.g. using
ldconfig or the LD_LIBRARY_PATH environment variable (cf. the Program
Library HOWTO, http://tldp.org/HOWTO/Program-Library-HOWTO/). The
dependencies of a program or library can be inspected with ldd.
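
The following sketch shows one way to apply this for libraries installed
under a private prefix. LIBPREFIX, its default value and the helper name
unresolved_libs are assumptions made for this example. CPPFLAGS and
LDFLAGS are standard Autoconf variables; whether a given setup needs them
in addition to LD_LIBRARY_PATH depends on the system.

```shell
#!/bin/sh
# Sketch for libraries installed under a private prefix.
# LIBPREFIX and its default are assumptions made for this example.
LIBPREFIX="${LIBPREFIX:-$HOME/local}"

# Prepend the private lib directory to the run-time linker search path,
# keeping any value that is already set.
LD_LIBRARY_PATH="$LIBPREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH

# unresolved_libs: read `ldd' output on stdin and print the name of every
# shared library that the dynamic linker reported as "not found".
unresolved_libs() {
    awk '/not found/ { print $1 }'
}

# Example (the library file name is hypothetical):
#   ldd "$LIBPREFIX/lib/libexample.so" | unresolved_libs
#
# Passing the extra header and library directories to configure:
#   ./configure "CPPFLAGS=-I$LIBPREFIX/include" "LDFLAGS=-L$LIBPREFIX/lib"
```

An empty result from unresolved_libs means that ldd could resolve every
dependency of the inspected file.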

Required:

Boost provides free portable C++ libraries for various applications.
flop and cheap both use the Boost Graph library.
Current GNU/Linux distributions usually provide this library as a pre-built
package. Make sure that the header files are installed, too (see above).
http://www.boost.org/

Optional:

If you want Unicode support in PET, you need the ICU package, freely
available from IBM.
Current GNU/Linux distributions usually provide this library as a pre-built
package. Make sure that the header files are installed, too (see above).
http://icu.sourceforge.net/download/

Install [incr tsdb()] for creating performance profiles of the grammar and
the system, and for parsing corpora.
http://www.delph-in.net/itsdb/

For the MRS output and FSPP preprocessor modules, you need a version of
Embeddable Common Lisp (ECL for short).
ATTENTION: The version that is known to work at the moment is 0.9h, while
0.9i seems to have problems. While some GNU/Linux distributions provide the
library as a pre-built package, this might be a newer version or it might
be compiled with the wrong options. In this case, you need to install it
yourself, which might also be problematic (e.g. on Ubuntu Linux 7.04,
ecl 0.9h does not compile without some modifications).
http://ecls.sourceforge.net/

For the XML input modes you need the Apache Xerces C++ library.
Current GNU/Linux distributions usually provide this library as a pre-built
package. Make sure that the header files are installed, too (see above).
http://xml.apache.org/xerces-c/

For unit tests you can optionally use cppunit. Note that so far only small
parts of the system take advantage of this.
Current GNU/Linux distributions usually provide this library as a pre-built
package. Make sure that the header files are installed, too (see above).
http://cppunit.sourceforge.net/

Doxygen-compatible documentation is included in most header files and some
of the source files. Call `make doc' in the root build directory to build
the API documentation.
Current GNU/Linux distributions usually provide this program as a pre-built
package.
http://www.doxygen.org/

Layout of the sources
=====================

common/  contains the sources shared between `flop' and `cheap'.
flop/    contains the sources for the preprocessor.
cheap/   contains the sources for the parser.
fspp/    Lisp-based input preprocessor module for cheap. Very slow.
goofy/   contains the sources for an outdated GUI prototype. This hasn't
         been compiled for two years and probably won't work.

Contact
=======

For bug reports or feature requests, check the ticket system at

    http://pet.opendfki.de

For questions and comments, please have a look at the mailing list

    pet@delph-in.net

or contact me directly at

    kiefer@dfki.de