- Gazettes resolution
- ===================
- A gazette is a mapping between a list of tokens and an entity kind. If that list of tokens
- matches your text exactly, the matching span is tagged as an entity of that kind.
- All entity occurrences that were detected by a gazette and share the same tokens belong to the same entity.
- This means that if you have a gazette that finds ``Dr. House`` and tags it as a ``PERSON``, all the occurrences in the text
- that match those tokens will belong to the same entity.
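The matching rule described above can be illustrated with a minimal sketch. This is not the library's implementation: the function names ``find_gazette_occurrences`` and ``group_by_entity`` and the in-memory dictionary are assumptions made only for illustration. It shows exact token matching and how occurrences with the same tokens end up under a single entity.

```python
from collections import defaultdict


def find_gazette_occurrences(tokens, gazettes):
    """Return (start, end, kind) for every exact gazette match.

    `gazettes` is a hypothetical dict mapping a tuple of tokens
    to an entity kind; `end` is exclusive.
    """
    occurrences = []
    for start in range(len(tokens)):
        for literal, kind in gazettes.items():
            end = start + len(literal)
            # Exact match only: every token must be identical.
            if tuple(tokens[start:end]) == literal:
                occurrences.append((start, end, kind))
    return occurrences


def group_by_entity(tokens, occurrences):
    """Group occurrences that share the same tokens and kind."""
    entities = defaultdict(list)
    for start, end, kind in occurrences:
        key = (tuple(tokens[start:end]), kind)
        entities[key].append((start, end))
    return dict(entities)


gazettes = {("Dr.", "House"): "PERSON", ("Lupus",): "DISEASE"}
tokens = "Dr. House said it is never Lupus , but Dr. House was wrong".split()

occurrences = find_gazette_occurrences(tokens, gazettes)
entities = group_by_entity(tokens, occurrences)
```

Here both mentions of ``Dr. House`` are grouped under one ``PERSON`` entity, while ``Lupus`` forms a separate ``DISEASE`` entity.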
- Basic usage: Loading from csv
- -----------------------------
- The basic usage is to load a set of gazettes before running the preprocess step. To load
- the gazettes into your database, use the script ``gazettes_loader.py`` that is included with
- your instance. It takes a CSV file with the following format:
- ::
- <literal>,<class>
- A literal can be a single token or multiple tokens separated by spaces.
- The only restriction is that every literal must be unique.
- For example, a gazettes csv file could be:
- ::
- literal,class
- Dr. House,PERSON
- Lupus,DISEASE
- Headache,SYMPTOMS
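Before loading a file, it may help to check the uniqueness restriction yourself. The following is a minimal sketch using only the standard library; it is not part of ``gazettes_loader.py``, the helper name ``read_gazettes`` is hypothetical, and it assumes the file starts with the ``literal,class`` header row shown in the example.

```python
import csv
import io


def read_gazettes(csv_file):
    """Read literal/class rows and reject duplicated literals.

    Hypothetical pre-check helper; assumes a `literal,class` header row.
    """
    gazettes = {}
    for row in csv.DictReader(csv_file):
        literal = row["literal"].strip()
        kind = row["class"].strip()
        if literal in gazettes:
            raise ValueError("duplicate literal: %r" % literal)
        gazettes[literal] = kind
    return gazettes


# In-memory stand-in for a gazettes CSV file on disk.
sample = io.StringIO(
    "literal,class\n"
    "Dr. House,PERSON\n"
    "Lupus,DISEASE\n"
    "Headache,SYMPTOMS\n"
)
gazettes = read_gazettes(sample)
```

With a real file, you would pass an open file object instead of the ``io.StringIO`` stand-in, for example ``open(path, newline="")``.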
- Removing elements
- -----------------
- When you delete an entity, all of its occurrences are deleted with it, along with the gazette item that introduced them.
- The same goes the other way: if you delete a gazette item, the entity, and therefore its occurrences, will be deleted as well.
- To delete a gazette item, go to the database admin page and open the Gazette section. There you can find the item that you want
- to remove.
- To remove an entity, find one of its occurrences by exploring a document in any of its views, and right-click it. There you'll find a delete
- link that lets you remove the whole entity. Keep in mind that this action will also delete the gazette item.