ACTER version 1.4 (normalised)
AylaRT committed Jul 15, 2020
1 parent 4196075 commit 68aa602
Showing 540 changed files with 6,346 additions and 7,197 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -127,3 +127,5 @@ dmypy.json

# Pyre type checker
.pyre/

.idea
37 changes: 25 additions & 12 deletions README.md
@@ -1,4 +1,4 @@
# ACTER Annotated Corpora for Term Extraction Research, version 1.3
# ACTER Annotated Corpora for Term Extraction Research, version 1.4

ACTER is a manually annotated dataset for term extraction, covering 3 languages (English, French, and Dutch),
and 4 domains (corruption, dressage, heart failure, and wind energy).
@@ -20,14 +20,14 @@ and 4 domains (corruption, dressage, heart failure, and wind energy).
* **Creator**: Ayla Rigouts Terryn
* **Association**: LT3 Language and Translation Technology Team, Ghent University
* **Date of creation version 1.0**: 17/12/2019
* **Date of creation current version 1.3**: 13/05/2020
* **Date of creation current version 1.4**: 15/07/2020
* **Last updated**: 13/05/2020
* **Contact**: ayla.rigoutsterryn@ugent.be
* **Context**: Ayla Rigouts Terryn's PhD project + first TermEval shared task (CompuTerm2020)
* **Shared Task**: see https://termeval.ugent.be; workshop proceedings with overview paper at
https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/COMPUTERM2020book.pdf
* **Annotation Guidelines**: http://hdl.handle.net/1854/LU-8503113
* **Source**: https://github.com/AylaRT/ACTER.git
* **Source**: https://github.com/AylaRT/ACTER
* **License**: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
(https://creativecommons.org/licenses/by-nc-sa/4.0/)
* **Reference**: Please cite the following Open Access paper if you use this dataset
@@ -161,23 +161,29 @@ The dataset has been updated since the publication of the former two papers.
These papers also discuss aspects of the data which have not been made available yet,
such as cross-lingual annotations and information on the span of the annotations.

** Number of annotations per corpus**:
**Number of annotations per corpus**:

Domain Language # term annotations # term + Named Entity annotations # Specific Terms # Common Terms # OOD Terms # Named Entities:
* corp en 927 1174 278 642 6 248
* corp en 927 1173 278 642 6 247
* equi en 1155 1575 777 309 69 420
* hf en 2361 2585 1883 319 157 226
* htfl en 2361 2585 1883 319 157 226
* wind en 1091 1534 781 296 14 443
* corp fr 982 1217 300 676 5 236
* equi fr 963 1183 703 234 26 220
* hf fr 2276 2423 1714 504 58 147
* corp fr 979 1207 298 675 5 229
* equi fr 961 1181 701 234 26 220
* htfl fr 2228 2374 1684 487 57 146
* wind fr 773 968 444 308 21 195
* corp nl 1047 1295 310 730 6 249
* equi nl 1395 1546 1023 331 41 151
* hf nl 2077 2257 1561 450 66 180
* equi nl 1393 1544 1022 330 41 151
* htfl nl 2074 2254 1559 449 66 180
* wind nl 940 1245 577 342 21 305


**Normalisation**:

The following normalisation procedures are applied to both the original text files and the annotations:
* unicodedata.normalize("NFC", text)
* normalising all dashes to "-", all single quotes to "'" and all double quotes to '"'
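
The two normalisation steps listed above could be reproduced roughly as follows. This is a minimal sketch, not the script used to build ACTER: the README only names `unicodedata.normalize("NFC", text)` and the three target characters, so the function name `normalise` and the exact sets of dash and quote characters below are assumptions.

```python
import unicodedata

# Assumed source characters: the README does not say which code points
# count as "dashes" or "quotes", so these sets are illustrative only.
DASHES = "\u2010\u2011\u2012\u2013\u2014\u2015\u2212"
SINGLE_QUOTES = "\u2018\u2019\u201A\u201B"
DOUBLE_QUOTES = "\u201C\u201D\u201E\u201F"

# One translation table that maps every variant to its ASCII target.
_TABLE = str.maketrans({
    **{char: "-" for char in DASHES},
    **{char: "'" for char in SINGLE_QUOTES},
    **{char: '"' for char in DOUBLE_QUOTES},
})


def normalise(text: str) -> str:
    """Apply NFC normalisation, then map dash and quote variants to ASCII."""
    return unicodedata.normalize("NFC", text).translate(_TABLE)
```

Applying the same function to both the texts and the annotation lists keeps them aligned: annotations that differed only in the apostrophe character (typographic vs. straight) collapse into one string and can then be deduplicated.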


## 6. Updates

@@ -220,12 +226,19 @@ Domain Language # term annotations # term + Named Entity annotations # Sp
* created Github repository for data + submitted it to CLARIN


**Changes version 1.3 > version 1.4**

* applied limited normalisation on both texts and annotations:
* unicodedata.normalize("NFC", text)
* normalising all dashes to "-", all single quotes to "'" and all double quotes to '"'



## 7. Error Reporting

The ACTER dataset is an ongoing project, so we are always looking to improve the data.
Any questions or issues regarding this dataset may be reported via the Github repository at:
https://github.com/AylaRT/ACTER.git and will be addressed asap.
https://github.com/AylaRT/ACTER and will be addressed asap.



1 change: 0 additions & 1 deletion en/corp/annotations/corp_en_terms_nes.ann
@@ -246,7 +246,6 @@ contractors Common_Term
contracts Common_Term
convention against corruption involving officials Named_Entity
convention on the protection of the european communities' financial interests Named_Entity
convention on the protection of the european communities’ financial interests Named_Entity
convict Common_Term
convicted Common_Term
conviction Common_Term
22 changes: 11 additions & 11 deletions en/corp/texts/annotated/corp_en_01.txt
@@ -53,30 +53,30 @@ Other actions are also punishable. Thus Article 314 of the Criminal Code punishe
The tax legislation also specifies that each count of corruption, liable to prosecution under the Criminal Code, both for a natural and for a legal person, cannot be deducted from the basis of tax assessment. Ex officio, a separate amount of 309 % of the value of the count of corruption is levied on the corporate taxpayer.


Plus the statutory additional tenths. The fine can either be imposed cumulatively with the penalty of imprisonment, or separately.
Plus the statutory 'additional tenths'. The fine can either be imposed cumulatively with the penalty of imprisonment, or separately.

Article 8 of the Law of 10 February 1999 concerning Penalties for Corruption, Moniteur Belge, 23 March 1999.

Article 11 of the Royal Decree of 23 November 2007 in amendment of the Law of 24 December 1993 on Government Contracts and Certain Contracting of Works, Supplies and Services and Certain Royal Decrees Implementing this Law, Moniteur Belge, 7 December 2007.

Plus statutory additional tenths.
Plus statutory 'additional tenths'.

Article 219 of the 1992 Income Tax Law.

What can companies do?

Corruption is one of the main barriers to companies operating on foreign markets. Depending on the sector and the country, there is a chance of coming into contact with corrupt behaviour sooner or later. It is therefore as well to be properly prepared for it. This brochure offers a number of tips on how to avoid and counter corrupt practices.

Codes of conduct forbid corruption, irrespective of its intended purpose. Therefore the scope of this prohibition is wider than winning or keeping a market. It applies not only to commission payments to officials, but also to failure to mention such payments in subcontracts, consultancy agreements, contracts of technical assistance abroad, etc.
Codes of conduct forbid corruption, irrespective of its intended purpose. Therefore the scope of this prohibition is wider than winning or keeping a market. It applies not only to commission payments to officials, but also to failure to mention such payments in subcontracts, consultancy agreements, contracts of 'technical assistance' abroad, etc.

Payments to local agents must be limited to settlement for lawfully provided services.

The auditing of corporate financial statements must avoid non-transparent or off-books accounting. Compliance with the rules on funding of political parties is required, and companies managerial or controlling organs must be informed of accounting irregularities detected.
The auditing of corporate financial statements must avoid non-transparent or off-books accounting. Compliance with the rules on funding of political parties is required, and companies' managerial or controlling organs must be informed of accounting irregularities detected.

The main measures constituting good management can be set forth in codes of conduct. Nevertheless, the best signal to a companys staff is the example set by its managers.
The main measures constituting good management can be set forth in codes of conduct. Nevertheless, the best signal to a company's staff is the example set by its managers.


Besides, sometimes it is difficult to distinguish between corruption and normal business contacts. Grey areas do exist, in which special attention must be paid to resolving awkward situations. Trading with Saddam Husseins Iraq was highly profitable, but could also have serious consequences, both for the companys reputation and in criminal law. Moreover, commission had to be paid to Iraqi officials and their intermediaries.
Besides, sometimes it is difficult to distinguish between corruption and normal business contacts. Grey areas do exist, in which special attention must be paid to resolving awkward situations. Trading with Saddam Hussein's Iraq was highly profitable, but could also have serious consequences, both for the company's reputation and in criminal law. Moreover, commission had to be paid to Iraqi officials and their intermediaries.

Belgium punishes the bribing of both foreign officials and people acting for a Belgian government department. The OECD (Organisation for Economic Cooperation and Development) carries out assessments to ensure that national governments really implement the 1997 anti-bribery convention. Regrettably, too few companies have so far followed these good practices.

@@ -90,7 +90,7 @@ My Purchasing Director has concluded a secret agreement with a supplier, who is

I know that the leaders of a certain country cream something off payments for the supply of commodities. Should I co-operate with my colleagues from other countries to show them that these practices belong to the past? No-one obliges me to do business with them.

I go to a sunny destination (Madrid/Rome/San Francisco) with the director of a public institution for my latest product launch. We spend the whole week on the tests. Why dont I invite him/her to view my new equipment in the nearest Belgian factory?
I go to a sunny destination (Madrid/Rome/San Francisco) with the director of a public institution for my latest product launch. We spend the whole week on the tests. Why don't I invite him/her to view my new equipment in the nearest Belgian factory?

You should also bear in mind some possible penalties, in addition to criminal prosecution, which you may incur if you act this way.

@@ -147,7 +147,7 @@ What is Belgium doing against corruption?

At international level

The internationalization of trade and the expansion of the European Union have led to growing realization that corruption is a phenomenon which needs to be tackled internationally. That is why there is support for Community rules which adopt a cross-border approach to corruption. This means that penalties are no longer limited to ones own national officials and/or persons and companies, but also apply to corruption of foreign officials and/or persons and companies.
The internationalization of trade and the expansion of the European Union have led to growing realization that corruption is a phenomenon which needs to be tackled internationally. That is why there is support for Community rules which adopt a cross-border approach to corruption. This means that penalties are no longer limited to one's own national officials and/or persons and companies, but also apply to corruption of foreign officials and/or persons and companies.


Belgian companies which are engaged on the international market must therefore be aware that corrupt behaviour will not be tolerated either in domestic or international trade. They have to realize that they may be held criminally liable for corrupt practices abroad as well.
@@ -164,20 +164,20 @@ The 1999 Council of Europe Criminal Law Convention on Corruption contains provis

The Council of Europe Civil Law Convention on Corruption, likewise of 1999, deals with the civil aspects of corruption. Belgium ratified this Convention in 2007.

The 1997 OECD Convention on Combating Bribery of Foreign Public Officials in International Business Transactions is very important to Belgiums private sector. The aim of the OECD Convention is to create a common framework and a level playing field for companies to compete in the countries which ratify the Convention. The Convention focuses on active corruption. It penalizes people who bribe a foreign official, including in countries which are not party to the Convention. There is also a strict evaluation mechanism within the OECD, which checks the countries for implementation and enforcement of the Convention.
The 1997 OECD Convention on Combating Bribery of Foreign Public Officials in International Business Transactions is very important to Belgium's private sector. The aim of the OECD Convention is to create a common framework and a level playing field for companies to compete in the countries which ratify the Convention. The Convention focuses on active corruption. It penalizes people who bribe a foreign official, including in countries which are not party to the Convention. There is also a strict evaluation mechanism within the OECD, which checks the countries for implementation and enforcement of the Convention.

The OECD Guidelines for Multinational Enterprises are recommendations aimed at companies with a view to socially responsible action. They contain a number of voluntary principles and standards for responsible behaviour in business relations.


The European Union has passed two directives (2004/17/EC and 2004/18/EC) on combating corruption and the consequences of corruption. Businesses which are convicted of bribery can be debarred from participation in public invitations to tender in other countries. A number of EU Member States have already set up blacklists in relation to public contracts.

Other international regulations and initiatives or foreign legislation may also be important to Belgian companies. Noteworthy initiatives in this regard are the World Banks Collective Action initiative (www.fightingcorruption.org), the 1996 WTO Agreement on Government Procurement (GPA) and the USAs Foreign Corrupt Practices Act. Belgian companies may be subject to the latter US legislation if, for example, they co-operate with an American company as part of a joint venture.
Other international regulations and initiatives or foreign legislation may also be important to Belgian companies. Noteworthy initiatives in this regard are the World Bank's Collective Action initiative (www.fightingcorruption.org), the 1996 WTO Agreement on Government Procurement (GPA) and the USA's Foreign Corrupt Practices Act. Belgian companies may be subject to the latter US legislation if, for example, they co-operate with an American company as part of a joint venture.

At national level
Since the late 1990s, Belgium has paid increasing attention to its anti-corruption policy. A number of national corruption scandals, and inadequate legal means of combating them, have forced Belgium to revise its penalties for bribery. International pressure has also been brought to bear. Hence Belgium has ratified the conventions at UN and EU levels, and signed and ratified those of the Council of Europe and OECD. Of course, signing and ratifying such conventions are not window-dressing: they require Belgium to fulfil a number of obligations.


In the late 1990s Belgium passed the Law against Bribery. It also introduced criminal liability of legal persons. Since then, Belgium has had firm legislation against both public and private bribery. It has created the opportunity to prosecute and convict legal as well as natural persons. These two laws represented the first important steps in Belgiums anti-corruption policy.
In the late 1990s Belgium passed the Law against Bribery. It also introduced criminal liability of legal persons. Since then, Belgium has had firm legislation against both public and private bribery. It has created the opportunity to prosecute and convict legal as well as natural persons. These two laws represented the first important steps in Belgium's anti-corruption policy.

Belgium has its own, specialized anti-corruption department within the Federal Criminal Investigation Police. This is the Central Office for the Repression of Corruption, answerable to the Directorate for Combating Economic and Financial Crime. The Central Office is authorized to investigate, and support the investigation of, malpractices detrimental to the interests of the state. This includes serious and complex misdemeanours involving corruption. The Central Office also fulfils a pilot role in the context of combating criminal abuses and misconduct with regard to government contracts, subsidy laws, recognitions and licensing. Its brief also extends to private corruption. The Central Office monitors the phenomenon of corruption to gain the clearest possible picture of it. The Belgian Federal Government also maintains regular contacts with counterpart foreign or international bodies, including OLAF, the European Commission's anti-fraud office.
