Thomas Schiex, permanent researcher

INRA Toulouse,
Dept. de Mathématique et Informatique appliquées.
Chemin de Borde Rouge - Auzeville, CS 52627
31326 Castanet Tolosan Cedex, France

e-mail: Thomas.Schiex@nospam@toulouse.inra.fr
Phone: (+33) 5.61.28.54.28
Fax: (+33) 5.61.28.53.35

Research themes

  • Artificial intelligence: mainly focused on the constraint satisfaction problem (CSP), an NP-hard problem defined in the AI community in the seventies, and more specifically on weighted and valued constraint networks. More broadly, I am interested in all sorts of graphical models (Bayesian nets, SAT, QBF, MDP, POMDP, stochastic and mixed CSP, influence diagrams, CP-nets...).

  • Bioinformatics: the application of CSP and other techniques originating from artificial intelligence and operations research to constrained optimisation problems, more specifically in computational biology. Currently, this mainly covers genetic marker ordering, genetic map joining, RNA secondary structure prediction, RNA and protein gene finding and prediction (with frameshift detection) for both prokaryotic and eukaryotic organisms, biological network inference and protein redesign.

External collaborations

  • With the Combinatorial optimization group of CERT (Centre d'Études et de Recherches de Toulouse de l'ONERA) on algorithms for CSP.
  • With Hélène Fargier, Jérôme Lang and Martin Cooper from the Institut de Recherches en Informatique de Toulouse on extensions of the CSP formalism and on valued constraint network properties and algorithms.
  • With the team of F. Rossi, University of Padova, Italy, for the comparison of Valued CSP and Semi-ring CSP.
  • With Pedro Meseguer and Javier Larrosa (Polytechnic University of Catalunya, Spain) on algorithms for valued and weighted CSP.
  • With P. Rouzé (University of Ghent, Belgium) on gene finding in Arabidopsis thaliana and other plants.
  • With Marie-France Sagot on algorithms for gene finding.

Tools & Software

Some pieces of software in which I have been involved to some extent.
  • HELP: a lazy (but efficient) interpreter for Lisp on the Macintosh, written during my PhD.
  • Con'FLEX: a C++ library dedicated to the expression and the resolution of (fuzzy) CSP.
  • LVCSP: a Common Lisp library of algorithms to solve general valued CSP.
  • Choco: formerly a Claire library, now ported to Java, aimed at solving finite domain CSP, with an emphasis on teaching and research.
  • ToolBar: an efficient, research-oriented C library that handles non-idempotent valued CSP with finite domains (and Bayesian networks) and solves optimization queries on them using recently developed local consistency techniques (see for example this paper). This library is developed in close collaboration with S. de Givry (INRA), J. Larrosa (UPC, Barcelona), F. Heras (UPC, Barcelona) and Emma Rollon (UPC, Barcelona).
  • MendelSoft: open source software that detects marker genotyping incompatibilities (Mendelian errors only) in complex pedigrees using weighted constraint satisfaction techniques. This software is directly derived from the weighted CSP solver toulbar2.
  • CARThAGENE: solves marker ordering problems and can also join maps (genetic or radiation hybrid) on various pedigrees. So far it supports backcross, RI (ri self, ri sib), F2 intercross, phase-known outbreds, and haploid and diploid radiation hybrids. CarthaGene is open source software, with efficient algorithms inside and a nice graphical interface. It runs under Unix, Linux and Windows.
  • MilPat: an efficient and powerful constraint-based structured motif search engine for genomic DNA. It can search for new members of known RNA gene families, taking into account possible interactions with other nucleotide sequences. Developed by P. Thébault during her PhD thesis.
  • DARN!: an efficient and powerful weighted-constraint-based structured motif search engine for genomic DNA, which supersedes MilPat. It can search for new members of known RNA gene families, taking into account possible interactions with other nucleotide sequences. Developed by Matthias Zytnicki during his PhD thesis. DARN! can also find RNA genes from an alignment.
  • FrameD: predicts genes and sequencing errors in prokaryotic organisms. This software has been used, among other things, to predict genes in two GC-rich organisms: Sinorhizobium meliloti and Ralstonia solanacearum. You can use the FrameD web site or get binaries from the same site.
  • EuGene: tries to locate genes (introns/exons) in eukaryotic sequences. It simultaneously exploits Markov models, existing signal evaluation/detection software (splice sites, ATG), and similarities with ESTs and proteins. EuGène has been used for the annotation of several complete eukaryotic (plant) genomes. Source code and binaries for different architectures are available on the EuGene home page.
  • EuGene'Hom: tries to locate genes (introns/exons) in eukaryotic sequences. It simultaneously exploits protein Markov models, signal evaluation/detection by probabilistic models (splice sites, ATG) and sequence conservation with homologous sequences from several organisms. Still limited to plants.

Institut National de la Recherche Agronomique
Département de mathématique et informatique appliquées