Dear list members,
Having thanked the people who helped me with my query regarding
"Parallel corpora and French software", here is a summary of the
results I obtained:
* software that I could use to tag/analyse my French data
Michael Barlow is currently developing ParaConc.
<The new version will be based on
<the code from MonoConc Pro and will be similar in functionality (but
<more functions) to the one that you are using, [ParaConc, 1995], but
<the underlying code will be different.
(Maria José Ribeiro <mj.ribeiro@NETC.PT>)
* tagger/concordancer which would enable me to retrieve occurrences
of the French subjunctive
Cordial 6 Universités has a tagger/lemmatizer for French which does it:
1 Il il PPER3S
2 faut falloir VINDP3S
3 que que SUB
4 je je PPER1S
5 vienne venir VSUBP1S
6 . . PCTFORTE
(Jean Veronis, http://www.up.univ-mrs.fr/~veronis)
For more information, contact SYNAPSE Développement
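Once text has been tagged in the four-column layout shown above (index,
token, lemma, tag), retrieving the subjunctive forms is a matter of
filtering on the tag. The following is only a minimal sketch in Python,
not part of Cordial itself. It assumes that every subjunctive tag starts
with "VSUB", which is inferred from the single example VSUBP1S above and
has not been checked against the full Cordial tagset. The function names
parse_tagged and find_subjunctives are hypothetical.

```python
# Sample output copied from the message above: index, token, lemma, tag.
TAGGED = """\
1 Il il PPER3S
2 faut falloir VINDP3S
3 que que SUB
4 je je PPER1S
5 vienne venir VSUBP1S
6 . . PCTFORTE
"""


def parse_tagged(text):
    """Split each line into an (index, token, lemma, tag) tuple."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip lines that do not match the four-column layout
        index, token, lemma, tag = fields
        rows.append((int(index), token, lemma, tag))
    return rows


def find_subjunctives(rows, prefix="VSUB"):
    """Return rows whose tag starts with the ASSUMED subjunctive prefix."""
    return [row for row in rows if row[3].startswith(prefix)]


rows = parse_tagged(TAGGED)
hits = find_subjunctives(rows)
for index, token, lemma, tag in hits:
    print(index, token, lemma, tag)
```

Splitting on whitespace is enough for this sample, but it would drop any
line whose token or lemma contains a space, so real output may need a
stricter column separator.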
* gather a French/English parallel corpus (with the texts being
aligned if possible).
<ARCADE corpus of ca. 1.5M words of Fr/En texts aligned at sentence level.
<The corpus is distributed by ELRA.
(Jean Veronis, firstname.lastname@example.org)
Tim Johns' website: http://web.bham.ac.uk/johnstf/timconc.htm
<He's been working on parallel concordancing within the Lingua
<project on multilingual parallel concordancing. I'm not
<quite sure whether you'll find actual corpora there, but
<there may be something, plus probably useful links.
(Antoine Consigny, email@example.com, firstname.lastname@example.org)
Two corpora, primarily political and legislative in their content, are
available from the LDC:
<UN Parallel Text (English/Spanish/French)
<-- you can request just the English and French data, if you
<prefer; the full corpus is a 3-cdrom set, with one language per
<cdrom, one text document per data file, and alignment at the level
<of document/file only.
<Canadian Hansards (French/English)
<-- a single cdrom containing
<two distinct sets of parallel text; one set is aligned at the
<sentence level, and the other (smaller) set is aligned at the
<paragraph level (with additional alignment data for individual
<word tokens within paragraphs).
Please write to email@example.com if you would like further
information or are interested in purchasing either of these corpora.
(Shannon Sears, Linguistic Data Consortium, firstname.lastname@example.org)
I hope this will be of interest to a lot of members.
Department of Linguistics and MEL
LANCASTER, LA1 4YT, UK
This archive was generated by hypermail 2b29 : Thu Jun 15 2000 - 13:26:02 MET DST