IEPY
====

IEPY is an open source tool for
`Information Extraction <http://en.wikipedia.org/wiki/Information_extraction>`_
focused on Relation Extraction.

To give an example of Relation Extraction, if we are trying to find a
birth date in:

    `"John von Neumann (December 28, 1903 – February 8, 1957) was a Hungarian and
    American pure and applied mathematician, physicist, inventor and polymath."`

then IEPY's task is to identify "``John von Neumann``" and
"``December 28, 1903``" as the subject and object entities of the "``was born in``"
relation.
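
To make the shape of that output concrete, here is a minimal sketch in plain Python. It is not IEPY's API and not how IEPY finds relations: the ``RelationEvidence`` class, the ``extract_birth_date`` function and the regular expression are hypothetical names invented for this illustration, and the pattern only handles sentences that open with ``Name (Month day, year``.

```python
import re
from typing import NamedTuple, Optional


class RelationEvidence(NamedTuple):
    """Hypothetical container for one extracted relation (not an IEPY class)."""
    subject: str
    relation: str
    obj: str


# Assumption: the sentence starts with the person's name, followed by an
# opening parenthesis and a date written as "Month day, year".
BIRTH_PATTERN = re.compile(
    r"^(?P<person>[^()]+?)\s*\((?P<born>[A-Z][a-z]+ \d{1,2}, \d{4})"
)


def extract_birth_date(sentence: str) -> Optional[RelationEvidence]:
    """Return the (subject, relation, object) triple, or None if nothing matches."""
    match = BIRTH_PATTERN.search(sentence)
    if match is None:
        return None
    return RelationEvidence(
        subject=match.group("person"),
        relation="was born in",
        obj=match.group("born"),
    )


sentence = (
    "John von Neumann (December 28, 1903 – February 8, 1957) was a Hungarian "
    "and American pure and applied mathematician, physicist, inventor and polymath."
)
evidence = extract_birth_date(sentence)
print(evidence)
```

A hand-written pattern like this breaks as soon as the wording changes, which is the gap that the rule-based and active learning tools listed under Features are meant to cover.
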

It's aimed at:

- `users <http://iepy.readthedocs.org/en/latest/active_learning_tutorial.html>`_
  needing to perform Information Extraction on a large dataset.
- `scientists <http://iepy.readthedocs.org/en/latest/how_to_hack.html>`_
  wanting to experiment with new IE algorithms.

Features
--------

- `A corpus annotation tool <http://iepy.readthedocs.org/en/latest/corpus_labeling.html>`_
  with a `web-based UI <http://iepy.readthedocs.org/en/latest/corpus_labeling.html#document-based-labeling>`_.
- `An active learning relation extraction tool <http://iepy.readthedocs.org/en/latest/active_learning_tutorial.html>`_
  pre-configured with convenient defaults.
- `A rule based relation extraction tool <http://iepy.readthedocs.org/en/latest/rules_tutorial.html>`_
  for cases where the documents are semi-structured or high precision is required.
- A web-based user interface that:

  - Allows non-expert users to control some aspects of IEPY.
  - Allows decentralization of human input.

- A shallow entity ontology with coreference resolution via `Stanford CoreNLP <http://nlp.stanford.edu/software/corenlp.shtml>`_.
- `An easily hackable active learning core <http://iepy.readthedocs.org/en/latest/how_to_hack.html>`_,
  ideal for scientists wanting to experiment with new algorithms.

Installation
------------

Install the required packages:

.. code-block:: bash

    sudo apt-get install build-essential python3-dev liblapack-dev libatlas-dev gfortran openjdk-7-jre

Then install IEPY with **pip**:

.. code-block:: bash

    pip install iepy

Full details about the installation are available on the
`Read the Docs <http://iepy.readthedocs.org/en/latest/installation.html>`__ page.

Running the tests
-----------------

If you are contributing to the project and want to run the tests:

- Make sure your JAVAHOME is correctly set
  (`read more about it here <http://iepy.readthedocs.io/en/latest/installation.html#install-iepy-package>`_).
- In the root of the project, run ``nosetests``.
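
The two steps above can be checked before launching the test suite. The following is a minimal sketch, not part of IEPY: ``check_test_prerequisites`` is a hypothetical helper, and it only verifies that the ``JAVAHOME`` variable is non-empty and that a ``nosetests`` executable is on ``PATH``. It does not verify that ``JAVAHOME`` points to a working Java installation.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def check_test_prerequisites(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of problems; an empty list means both checks passed.

    ``env`` and ``which`` are injectable so the check can be exercised
    without touching the real environment.
    """
    problems = []
    # Assumption: an unset or empty JAVAHOME counts as "not correctly set".
    if not env.get("JAVAHOME"):
        problems.append("JAVAHOME is not set")
    if which("nosetests") is None:
        problems.append("nosetests was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_test_prerequisites():
        print("warning:", problem)
```

Run from the root of the project, the script prints one warning per failed check and prints nothing when both pass.
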

Learn more
----------

The full documentation is available on `Read the Docs <http://iepy.readthedocs.org/en/latest/>`__.

Authors
-------

IEPY is © 2014 `Machinalis <http://www.machinalis.com/>`_ in collaboration
with the `NLP Group at UNC-FaMAF <http://pln.famaf.unc.edu.ar/>`_. Its primary
authors are:

* Rafael Carrascosa <rcarrascosa@machinalis.com> (rafacarrascosa at github)
* Javier Mansilla <jmansilla@machinalis.com> (jmansilla at github)
* Gonzalo García Berrotarán <ggarcia@machinalis.com> (j0hn at github)
* Franco M. Luque <francolq@famaf.unc.edu.ar> (francolq at github)
* Daniel Moisset <dmoisset@machinalis.com> (dmoisset at github)

You can follow the development of this project and report issues at
http://github.com/machinalis/iepy

You can join the mailing list `here <https://groups.google.com/forum/?hl=es-419#%21forum/iepy>`__.