Instantiation
=============

Here, we'll explain in detail what an instantiation contains and what it does.

Folder structure
----------------

The folder structure of an iepy instance is the following:

.. code-block:: bash

    ├── __init__.py
    ├── settings.py
    ├── database_name_you_picked.sqlite
    ├── bin
    │   ├── csv_to_iepy.py
    │   ├── iepy_rules_runner.py
    │   ├── iepy_runner.py
    │   ├── manage.py
    │   ├── preprocess.py
    │   └── rules_verifier.py
    ├── extractor_config.json
    └── rules.py
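
As a quick sanity check, the layout above can be verified with a few lines of Python. This is only a sketch and not part of iepy: the helper name ``missing_instance_files`` and the ``EXPECTED_FILES`` list are made up for illustration, and the list is copied by hand from the tree shown above.

```python
from pathlib import Path

# Files listed in the tree above. The SQLite file is left out because its
# name is whatever you picked when creating the instance.
EXPECTED_FILES = [
    "__init__.py",
    "settings.py",
    "extractor_config.json",
    "rules.py",
    "bin/csv_to_iepy.py",
    "bin/iepy_rules_runner.py",
    "bin/iepy_runner.py",
    "bin/manage.py",
    "bin/preprocess.py",
    "bin/rules_verifier.py",
]


def missing_instance_files(instance_path):
    """Return the expected files that are absent from instance_path."""
    root = Path(instance_path)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

For a complete instance, ``missing_instance_files("path/to/instance")`` returns an empty list; otherwise it returns the relative paths that could not be found.
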

Let's see why each one of those files is there:

Settings file
.............

``settings.py`` is a configuration file where you can change the database settings and all the settings related to the web interface.
It follows the `django settings <https://docs.djangoproject.com/en/1.7/ref/settings/>`_ file format.

Database
........

When you create an instance, a *sqlite* database is created by default.
It has no data yet, since you'll have to fill it with your own data.

When working with big datasets, it's recommended to use a database engine other than *sqlite*.
To change the database engine, edit the ``DATABASES`` section of the settings file:

::

    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.sqlite3',
            'NAME': 'database_name_you_picked.sqlite',
        }
    }

For example, you can use PostgreSQL like this:

::

    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'NAME': 'your_database_name',
        }
    }

(Remember that you'll need to install ``psycopg2`` first, with a simple ``pip install psycopg2``.)

Take a look at the `django database configuration documentation <https://docs.djangoproject.com/en/dev/ref/settings/#databases>`_ for more detail.

.. note::

    Each time you change your database (either the engine or the name) you will have
    to instruct *django* to create all the tables in it, like this:

    .. code-block:: bash

        python bin/manage.py migrate
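
A PostgreSQL server usually also needs credentials and a host. The sketch below shows one possible way to fill in the ``DATABASES`` section with them. The ``USER``, ``PASSWORD``, ``HOST`` and ``PORT`` keys are standard Django database settings, but the ``IEPY_DB_*`` environment variable names and the fallback values are assumptions made for this example, not something iepy defines.

```python
import os

# Hypothetical environment variable names; pick whatever fits your deployment.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": os.environ.get("IEPY_DB_NAME", "your_database_name"),
        "USER": os.environ.get("IEPY_DB_USER", ""),
        "PASSWORD": os.environ.get("IEPY_DB_PASSWORD", ""),
        "HOST": os.environ.get("IEPY_DB_HOST", "localhost"),
        "PORT": os.environ.get("IEPY_DB_PORT", "5432"),
    }
}
```

Reading the values from the environment keeps passwords out of ``settings.py``. As with any change of engine or name, the ``migrate`` step has to be run again afterwards.
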

Active learning configuration
.............................

``extractor_config.json`` specifies the configuration of the active learning core in *json* format.
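
Since the file is edited by hand, it can be worth checking that it is still valid *json* before starting a run. The snippet below is a minimal sketch using only the standard library; the function name ``load_extractor_config`` is made up for this example, and it makes no assumption about which keys the file contains.

```python
import json


def load_extractor_config(path="extractor_config.json"):
    """Parse the config file, raising ValueError if it is not valid JSON."""
    with open(path, encoding="utf-8") as config_file:
        return json.load(config_file)
```

A syntax error such as a trailing comma surfaces as a ``json.JSONDecodeError`` (a subclass of ``ValueError``) that reports the offending line and column.
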

Rules definition
................

If you decide to use the rule-based core, you'll have to define all your rules in the file ``rules.py``.

You can verify that your rules run correctly using ``bin/rules_verifier.py``.
Read more about it `here <rules_tutorial.html#verifying-your-rules>`__.

CSV importer
............

In the ``bin`` folder, you'll find a tool to import data from CSV files. This is the script ``csv_to_iepy.py``.
Your CSV data has to be in the following format:

::

    <document_id>, <document_text>
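
If your documents live somewhere else (a folder of text files, another database), a file in the two-column layout above can be produced with Python's ``csv`` module. This is a sketch under assumptions: the helper name ``write_documents_csv`` is made up, and since this page does not say whether the importer expects a header row, none is written here.

```python
import csv


def write_documents_csv(path, documents):
    """Write (document_id, document_text) pairs, one row per document."""
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file)
        for document_id, document_text in documents:
            # csv.writer quotes fields that contain commas, quotes or newlines.
            writer.writerow([document_id, document_text])
```

Letting the ``csv`` module do the quoting matters because document texts almost always contain commas, which would otherwise be read as extra columns.
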

Preprocess
..........

To preprocess your data, you will use the ``bin/preprocess.py`` script. Read more about it :doc:`here <preprocess>`.

Runners
.......

In the ``bin`` folder, you have scripts to run the active learning core (``iepy_runner.py``) or the
rule-based core (``iepy_rules_runner.py``).

Web UI management
.................

For the web server management, you have the ``bin/manage.py`` script. This is a `django manage file <https://docs.djangoproject.com/en/1.7/ref/django-admin/>`_,
and with it you can start up your server.

Instance Upgrade
----------------

From time to time, small changes in the iepy internals will require an *upgrade* of existing iepy instances.
The upgrade process will apply the needed changes to the instance-folder structure.

If you made local changes, the tool will preserve a copy of your changes so you can merge the conflicting areas by hand.

To upgrade an iepy instance, simply run the following command:

.. code-block:: bash

    iepy --upgrade <instance path>

.. note::

    Look at the settings file to find the version of any iepy instance.